the physics world and related elements, player inputs
You forgot the third vitally important part of a game: taking the game state and presenting it to the user.
Looking something up in a local data structure takes microseconds at worst (and less if it's in CPU cache). Looking something up over an internet connection takes milliseconds at best, and more likely tens or hundreds of milliseconds; on a bad day I've seen latencies get up into the seconds. A frame is of the order of 20 milliseconds, so on a good day you might get tens of round trips per frame, and on a bad day you might be waiting numerous frames for a single round trip.
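To put rough numbers on that, here is a small back-of-the-envelope sketch. The 20 ms frame budget comes from the figure above; the round-trip times in the table are illustrative assumptions, not measurements.

```python
FRAME_MS = 20.0  # frame budget from the text above (about 50 frames per second)


def round_trips_per_frame(rtt_ms: float, frame_ms: float = FRAME_MS) -> float:
    """How many back-to-back round trips fit into one frame."""
    return frame_ms / rtt_ms


def frames_per_round_trip(rtt_ms: float, frame_ms: float = FRAME_MS) -> float:
    """How many frames elapse while waiting for one round trip."""
    return rtt_ms / frame_ms


# Illustrative round-trip times (assumed values, not measurements).
EXAMPLE_RTTS_MS = [
    ("best case", 1.0),
    ("typical", 50.0),
    ("poor", 200.0),
    ("bad day", 2000.0),
]

for label, rtt in EXAMPLE_RTTS_MS:
    print(
        f"{label:>9}: {round_trips_per_frame(rtt):6.2f} round trips per frame, "
        f"{frames_per_round_trip(rtt):6.2f} frames per round trip"
    )
```

Anything that needs more than a fraction of a round trip per frame simply cannot be done synchronously.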
So you can't have a client render a frame by making a bunch of synchronous requests for information from the server. Even a single request may receive a response several frames after it was sent. So generally whatever you choose to send must be sent in a largely asynchronous manner.
At one extreme you could use a "remote play" approach: the client sends requests to the server and gets a video stream back. All the chatty code sits on the server. You could also do a variant of this where, instead of sending a video stream, you send a series of drawing commands.
This is very secure and potentially allows a lightweight client, but it needs both a powerful server and a fast, stable, low-latency network. It's usually ruled out as impractical for these reasons.
At the other extreme is the fully synchronised game state approach. Player (and sometimes AI) inputs are distributed to all the clients, which then maintain their own copies of the complete game state. This keeps network traffic to a minimum and means the server can skip UI code and handle only game state. Sometimes you can do this without even having a "server" at all.
But it creates a few problems.
- The logic must operate exactly the same on all clients. This can be easier said than done, as some code behaves differently on different systems. Floating point arithmetic was historically a problem, though as far as I can tell it's less of an issue nowadays. Care must also be taken to ensure that inputs are applied consistently on all clients and that random number generation is done in a synchronised manner. Synchronised games will often have two separate RNGs: one for use by the game state, and one for (non-synced) display code.
- It enables cheating. A modified game client can easily be made to reveal information that the player should not have.
- It can make bringing new clients into an existing game difficult. Games that use this technique (for example, OpenTTD and Factorio) often have significant pauses of the game when a new client connects.
- Because all systems must process inputs at the same time, a slow network can still cause unacceptable latency between player input and response on screen.
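A minimal sketch of the synchronised approach with the two-RNG split might look like the following. Everything here (`LockstepSim`, `run_client`, the one-dimensional "position" state) is made up for illustration and is not taken from any real game. It sticks to integer state to sidestep floating point, and it assumes every client runs the same Python version so the seeded generator produces the same sequence everywhere.

```python
import random
import zlib


class LockstepSim:
    """Toy synchronised simulation: every client runs an identical copy."""

    def __init__(self, seed: int):
        self.state_rng = random.Random(seed)  # synced: only game logic may use it
        self.display_rng = random.Random()    # unsynced: cosmetic effects only
        self.tick = 0
        self.position = 0                     # integer state avoids float drift

    def step(self, inputs):
        """Apply one tick. `inputs` is a list of (player_id, move) pairs."""
        # Sort so every client applies inputs in the same order,
        # whatever order they arrived in over the network.
        for _player_id, move in sorted(inputs):
            self.position += move
        # Random gameplay event, drawn from the synced RNG.
        self.position += self.state_rng.randint(-1, 1)
        self.tick += 1

    def particle_jitter(self) -> float:
        """Cosmetic randomness; must never feed back into game state."""
        return self.display_rng.random()

    def checksum(self) -> int:
        """Cheap fingerprint clients can exchange to detect a desync."""
        return zlib.crc32(f"{self.tick}:{self.position}".encode())


def run_client(seed: int, input_log):
    sim = LockstepSim(seed)
    for tick_inputs in input_log:
        sim.particle_jitter()  # display code runs, but does not touch state_rng
        sim.step(tick_inputs)
    return sim


# The same input log delivered to two clients, with per-tick arrival order shuffled.
INPUT_LOG = [[(1, 2), (2, -1)], [(2, 3)], [], [(1, -4), (2, 1)]]
client_a = run_client(1234, INPUT_LOG)
client_b = run_client(1234, [list(reversed(t)) for t in INPUT_LOG])
print("in sync:", client_a.checksum() == client_b.checksum())
```

If display code ever drew from `state_rng`, a client with different graphics settings would consume a different number of random values and silently drift out of sync, which is why the two generators are kept apart.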
So many games end up with a hybrid approach: the server has a complete copy of the game state and clients have partial copies, which the server updates asynchronously. The client then uses its partial copy of the game state to render the game.
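As a rough sketch of what "partial copies" can mean, the server might filter the world down to what each player is allowed to see before sending it. The names (`Entity`, `snapshot_for`, `ClientView`) and the simple view-radius rule are assumptions for illustration only; real games use all sorts of visibility rules and usually send deltas rather than whole snapshots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: int
    x: int
    y: int


def snapshot_for(world: dict, viewer_id: int, view_radius: int) -> dict:
    """Server side: return only the entities this viewer is allowed to know about."""
    viewer = world[viewer_id]
    radius_sq = view_radius * view_radius
    return {
        eid: e
        for eid, e in world.items()
        if (e.x - viewer.x) ** 2 + (e.y - viewer.y) ** 2 <= radius_sq
    }


class ClientView:
    """Client side: a partial copy of the game state, used only for rendering."""

    def __init__(self):
        self.entities = {}

    def apply_snapshot(self, snapshot: dict) -> None:
        # Called whenever an update happens to arrive; rendering never waits on it.
        self.entities = dict(snapshot)

    def visible_ids(self) -> list:
        return sorted(self.entities)


# The server holds the complete state.
world = {
    1: Entity(1, 0, 0),      # the viewing player
    2: Entity(2, 3, 4),      # nearby (distance 5)
    3: Entity(3, 100, 100),  # far away; never sent to player 1
}

view = ClientView()
view.apply_snapshot(snapshot_for(world, viewer_id=1, view_radius=10))
print(view.visible_ids())
```

A modified client can't reveal entity 3 to its player, because that information never left the server.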
Most of the simulation happens on the server, but some parts may be pushed to the client. In particular, in first/third person 3D games the player character's movement is often simulated on the client and only later sent to the server. The server must then try to reconcile what happened on the client with what is happening on the server. Doing this in a way that minimises the scope for cheating and minimises noticeable artefacts, while also hiding as much of the latency as possible from users, is probably one of the biggest challenges in competitive shooters.
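One common way to structure this is client-side prediction with server reconciliation: the client applies its own inputs immediately, remembers the ones the server has not yet acknowledged, and replays them on top of each authoritative state it receives. The sketch below is a deliberately tiny one-dimensional version under assumed names (`Server`, `PredictingClient`, `MAX_STEP`); it is not how any particular engine does it and it ignores interpolation, lag compensation and packet loss.

```python
from dataclasses import dataclass, field

MAX_STEP = 1  # assumed movement rule: at most one unit per input


def apply_input(x: int, move: int) -> int:
    """Shared movement rule, clamped so an oversized move can't exceed the limit."""
    return x + max(-MAX_STEP, min(MAX_STEP, move))


@dataclass
class Server:
    x: int = 0
    last_seq: int = 0

    def receive(self, seq: int, move: int):
        """Apply a client input authoritatively and return (acked_seq, position)."""
        self.x = apply_input(self.x, move)
        self.last_seq = seq
        return self.last_seq, self.x


@dataclass
class PredictingClient:
    x: int = 0
    next_seq: int = 1
    pending: list = field(default_factory=list)  # inputs not yet acknowledged

    def local_input(self, move: int):
        """Predict the result immediately; return the message to send to the server."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, move))
        self.x = apply_input(self.x, move)
        return seq, move

    def on_server_state(self, acked_seq: int, server_x: int) -> None:
        """Reconcile: adopt the server's position, then replay unacknowledged inputs."""
        self.pending = [(s, m) for s, m in self.pending if s > acked_seq]
        x = server_x
        for _seq, move in self.pending:
            x = apply_input(x, move)
        self.x = x


client, server = PredictingClient(), Server()
messages = [client.local_input(1) for _ in range(3)]  # three steps, shown instantly
ack = server.receive(*messages[0])                    # server has only seen the first
client.on_server_state(*ack)
print(client.x, len(client.pending))
```

When the server agrees with the prediction, the replay lands on the same position and the player notices nothing. When it disagrees (say the server decided the first step was blocked), the client's position snaps to the corrected value, and hiding that snap is where the visible artefacts come from.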