To understand everything fully, we will have to sink gradually to the lowest level, all the way down to the hardware. But we should start at the top level: our application.
So, we write in our favorite high-level language, be it JS/Rust/C#/Scala/Python or anything else. In today's world we are likely to have some abstraction for dealing with asynchronous APIs, provided by the standard library, the language itself, or external libraries. It can be primitive and callback-based, or more advanced, like Future/Promise/Task or something similar. Sometimes the language gives us syntax like async/await to work with these abstractions, and sometimes the asynchronous work is hidden away in the language runtime entirely, as with goroutines in Go. But in any case, somewhere under the hood there will be an event loop; and sometimes there isn't one, since nobody forbids us from writing multithreaded code that makes asynchronous calls at the same time.
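As a concrete illustration of such an abstraction, here is a minimal Python sketch (the coroutine names `fetch_value` and `main` are made up for the example): `await` suspends a coroutine on a Future-like task while the event loop is free to do other work.

```python
import asyncio

async def fetch_value() -> int:
    # Pretend to wait on some asynchronous I/O; while we "wait",
    # control returns to the event loop.
    await asyncio.sleep(0)
    return 42

async def main() -> int:
    # `await` suspends this coroutine until the task completes;
    # meanwhile the event loop could run other tasks.
    task = asyncio.ensure_future(fetch_value())
    return await task

result = asyncio.run(main())
print(result)  # prints 42
```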
The event loop is nothing more than the usual while(true) or any other infinite loop. Within this loop, our program polls some queue (if you don't know that data structure, do a quick search) that contains the results of already completed tasks. The program takes the next result, finds the callback/Promise/Future/Task waiting for it, and starts executing the pending code. Again, there may be several queues, and they may be processed in different ways, but that doesn't matter. The important thing is that our main thread (or threads) knows nothing about how the asynchronous tasks are actually run. It just checks whether there is a result in the queue: if there is, it processes it; if not, it decides either to exit the loop (terminating the thread, and sometimes the whole process) or to sleep until new results appear.
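The whole mechanism fits in a few lines. Here is a toy sketch in Python (the task id "t1" and the single-queue layout are simplifications for the example): an infinite loop pulls completed results from a queue, looks up the callback registered for each one, and runs it; a sentinel value stands in for the "no more work, exit" decision.

```python
import queue

# Completed results arrive here (normally pushed by background workers).
results: "queue.Queue" = queue.Queue()

# Callbacks waiting for their result, keyed by task id.
pending = {}

def event_loop():
    while True:                          # the infamous while(true)
        task_id, value = results.get()   # sleeps until a result appears
        if task_id is None:              # sentinel: decide to exit the loop
            break
        callback = pending.pop(task_id)  # find the code waiting for this result
        callback(value)                  # run the deferred code

# Wire up one fake "completed task" and a callback for it.
out = []
pending["t1"] = out.append
results.put(("t1", "hello"))
results.put((None, None))
event_loop()
print(out)  # prints ['hello']
```

Note that `event_loop` never learns how the task was executed; it only sees the finished result in the queue.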
But how do the results get into the queue? You have to understand that an asynchronous program is almost always multi-threaded: the results of operations are pushed into the queue by background threads that simply block while waiting for the resource they need (or for many resources at once, if they use a system API like epoll or kqueue). Most of the time these background threads sit in a waiting state, so they consume no CPU and are not scheduled by the OS. This simple model really does save resources compared to one where many threads each handle a single task independently, each blocking on its own request.
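Continuing the sketch (a minimal Python example; the 0.05-second sleep stands in for blocking on a real resource like a disk or socket): a background thread blocks, wakes up when its "resource" is ready, and drops the result into the queue the main thread is watching.

```python
import queue
import threading
import time

results: "queue.Queue" = queue.Queue()

def worker(task_id: str, seconds: float) -> None:
    # This thread blocks waiting for its "resource"; while blocked it
    # consumes no CPU and is parked by the OS scheduler.
    time.sleep(seconds)
    results.put((task_id, f"done after {seconds}s"))

threading.Thread(target=worker, args=("t1", 0.05), daemon=True).start()

# The main thread knows nothing about how the task ran; it just
# blocks until the next result shows up in the queue.
task_id, value = results.get()
print(task_id, value)
```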
It is important to note that in today's world even mid-level languages like C or C++, not to mention high-level ones, do not implement asynchrony themselves. Firstly, different operating systems expose different APIs. Secondly, those APIs can handle different resource types on different operating systems (all the major OSes know how to work with the network, but beyond the network you can also work asynchronously with user input, disks, and peripherals like scanners, webcams, and other USB devices). The most popular cross-platform library (IMHO) is libuv; in Rust it is customary to use mio (or an abstraction on top of it, like tokio); C# has similar mechanisms in .NET Core; and in Go it is already sewn into the ~1.5 MB of runtime that Go bundles into every executable (admittedly the GC lives there too, but that is just one feature among many packed in alongside the libuv-style machinery).
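To see what such libraries wrap, here is a minimal sketch using Python's standard selectors module, which picks the best OS multiplexing API available (epoll on Linux, kqueue on macOS/BSD). A local socket pair stands in for real network peers; the point is that one blocking call waits on all registered sockets at once.

```python
import selectors
import socket

# `DefaultSelector` chooses epoll/kqueue/etc. depending on the OS.
sel = selectors.DefaultSelector()

# A connected socket pair stands in for real network peers.
r, w = socket.socketpair()
r.setblocking(False)
sel.register(r, selectors.EVENT_READ)

w.sendall(b"ping")

# One blocking call waits for readiness on *all* registered sockets.
for key, events in sel.select(timeout=1):
    data = key.fileobj.recv(1024)
    print(data)  # prints b'ping'

sel.unregister(r)
r.close()
w.close()
```

A single background thread blocked in `select()` like this can serve thousands of connections, which is exactly the trick libuv, mio, and the Go runtime build on.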
OK, the application code is more or less clear. What happens in the OS kernel? After all, as mentioned above, we even have APIs for waiting on requests in batches. It's simple: kernels became asynchronous before it was mainstream, at least if we're not dealing with a real-time OS (and we're talking about Windows/Linux/Mac and the like, not the onboard computer of a Boeing, where timing is critical). See, when something happens on an external peripheral (say, the disk has read the requested data, network data has arrived, or the user moved the mouse), an interrupt is generated. The CPU really does stop its current work and goes to see what happened; more precisely, it calls a handler provided by the OS. But the OS has its own main work to do, so it tries to free up the handler as quickly as possible: it simply dumps the data into RAM and deals with it later, when its turn comes. Sound familiar? It's very similar to what happened in the event loop, except that instead of background threads, the "results" are queued by interrupts. The OS then hands the data to the device driver, and so on, until it reaches our application. That's all, no magic.