# VCL Scheduler

## Introduction

The VCL scheduler handles LO's primary event queue. It is simple by design,
currently just a singly-linked list, processed in list order by priority
using round-robin for recurring tasks.

The scheduler has the following behaviour:

B.1. Tasks are scheduled based on priority only

B.2. Implicitly cooperative AKA non-preemptive

B.3. It's not "fair" in any way (a consequence of B.2)

B.4. Tasks are handled round-robin (per priority)

B.5. Higher priorities have lower values

B.6. A small set of priorities instead of a flexible value AKA an int

There are some consequences due to this design.

C.1. Higher priority tasks starve lower priority tasks
     As long as a higher priority task is available, lower priority tasks
     are never run! See the Anti-pattern section.

C.2. Tasks should be split into sensible blocks
     If this can't really be done, process pending tasks by calling
     `Application::Reschedule()`. Or use a thread.

C.3. This is not an OS scheduler
     There is no real way to "fix" B.2. and B.3.
     If you need a preemptive task, use a thread!
     Otherwise make your task suspendable.
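The "suspendable task" advice in C.2/C.3 can be sketched in plain C++. This is an illustrative model only — `ChunkedJob` and its `Invoke()` are hypothetical names, not VCL API: each scheduler invocation processes one bounded block, and the task stays active until all work is done.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a "suspendable" task (not actual VCL API): instead
// of doing all the work in one Invoke(), each invocation handles a bounded
// block, so the cooperative scheduler regains control quickly (see C.2).
class ChunkedJob
{
    std::vector<int> m_aData;
    std::size_t m_nPos = 0;
    static constexpr std::size_t nBlockSize = 100; // work items per invocation

public:
    explicit ChunkedJob(std::size_t nItems) : m_aData(nItems, 1) {}

    // Called by the scheduler; returns true while more work is pending,
    // mimicking a Task that stays active and is re-scheduled round-robin.
    bool Invoke()
    {
        const std::size_t nEnd = std::min(m_nPos + nBlockSize, m_aData.size());
        for (; m_nPos < nEnd; ++m_nPos)
            m_aData[m_nPos] *= 2; // the actual unit of work
        return m_nPos < m_aData.size();
    }

    std::size_t DoneCount() const { return m_nPos; }
};
```

The block size is the tuning knob: small enough that one invocation never hogs the (non-preemptive) scheduler, large enough that per-invocation overhead stays negligible.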

## Driving the scheduler AKA the system timer

1. There is just one system timer, which drives the LO event loop
2. The timer has to run in the main window thread
3. The scheduler is run with the Solar mutex acquired
4. The system timer is a single-shot timer
5. The scheduler system event / message has a low system priority.
   All system events should have a higher priority.

Every time a task is started, the scheduler timer is adjusted. When the timer
fires, it posts an event to the system message queue. If the next most
important task is an Idle (AKA instant, 0ms timeout), the event is pushed to
the back of the queue, so we don't starve system messages; otherwise it is
pushed to the front.

Every time the scheduler is invoked, it searches for the next task to process,
restarts the timer with the timeout for the next event and then invokes the
task. After invoking the task, and if the task is still active, it is pushed
to the end of the queue and the timeout is eventually adjusted.
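
The selection rule described above — lowest priority value wins, ties rotate round-robin — can be modeled with a toy sketch (hypothetical names, not the real scheduler code):

```cpp
#include <list>
#include <string>

// Toy model of the scheduler's selection rule (illustrative, not VCL code):
// lower priority value wins (B.5), ties are broken by list order, and an
// invoked task is moved to the back of the list (B.4, round-robin).
struct ToyTask
{
    std::string sName;
    int nPriority; // lower value == higher priority
    bool bActive = true;
};

// Returns the name of the invoked task, or "" if nothing is ready.
std::string InvokeNextTask(std::list<ToyTask>& rTasks)
{
    auto itBest = rTasks.end();
    for (auto it = rTasks.begin(); it != rTasks.end(); ++it)
        if (it->bActive && (itBest == rTasks.end() || it->nPriority < itBest->nPriority))
            itBest = it;
    if (itBest == rTasks.end())
        return "";
    std::string sInvoked = itBest->sName;
    rTasks.splice(rTasks.end(), rTasks, itBest); // rotate to the back
    return sInvoked;
}
```

With two priority-0 tasks and one priority-5 task active, the picks alternate between the two priority-0 tasks forever — exactly the starvation described in C.1.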


## Locking

The locking is quite primitive: all interaction with internal Scheduler
structures is locked. This includes the `ImplSchedulerContext` and
`Task::mpSchedulerData`, which is actually a part of the scheduler.
Before invoking the task, we have to release the lock, so others can
start new Tasks.

The Scheduler just processes its own Tasks in the main thread and needs
the `SolarMutex` for that and for `DeInit` (tested by `DBG_TESTSOLARMUTEX`). All
other interactions just take the scheduler mutex or don't need locking
at all.

There is a "workaround" for static Task objects, which would crash LO on
destruction, because `Task::~Task` would try to de-register itself in the
Scheduler, while the `SchedulerLock` would be long gone. OTOH this makes
Task handling less error-prone than doing "special" manual cleanup.
There is also an entry in "TODOs and ideas" about getting rid of static Tasks.

Actually the scheduler mutex should never be locked when calling into
non-scheduler code, so it was converted to a non-recursive
`std::mutex`.


## Idle processing

Confusingly, there are 2 concepts that are called 'idle':

* Instant (zero timeout) tasks, represented e.g. by the Idle class. This is
  a misnomer, as these tasks are processed after returning to the main loop.
  This is not necessarily when LO is idle; in fact such tasks may be invoked
  while there is pending input in the OS event queue.
  (TODO: This case should be fixed by renaming.)

* Low priority tasks, represented by priorities `TaskPriority::HIGH_IDLE` and
  lower. In addition to being invoked only when there is no task with a higher
  priority, pending input in the OS event queue also takes precedence.


## Lifecycle / thread-safety of Scheduler-based objects

A scheduler object is thread-safe in the sense that it can be associated with
any thread and any thread is free to call any functions on it. The owner must
guarantee that the `Invoke()` function can be called while the Scheduler object
exists / is not disposed.


## Anti-pattern: Dependencies via (fine grained) priorities

"Idle 1" should run before "Idle 2", therefore give "Idle 1" a higher priority
than "Idle 2". This just works correctly for low frequency idles, but otherwise
always breaks!

If you have some longer work - even if it can be split into schedulable,
smaller blocks - you normally don't want to schedule it with a non-default
priority, as it starves all lower priority tasks. Even if a block was processed
in "Idle 1", it is scheduled with the same (higher) priority again. Changing
the "Idle" to a "Timer" also won't work, as this breaks the dependency.

What is needed is task based dependency handling, so if "Task 1" is done, it
has to start "Task 2" and if "Task 1" is started again, it has to stop
"Task 2". This currently has to be done by the implementor, but this feature
could reasonably be added to the scheduler.


## Implementation details

### General: event priority for `DoYield`

There are three types of events, with different priority:

1. LO user events
2. System events
3. LO Scheduler event

They should be processed according to the following code:

```
bool ImplYield(bool bWait, bool bAllCurrent)
{
    DBG_TESTSOLARMUTEX();
    assert(IsMainThread());

    bool bWasEvent = ProcessUserEvents( bAllCurrent );
    if ( !bAllCurrent && bWasEvent )
        return true;

    SolarMutexReleaser aReleaser;
    bool bWasSchedulerEvent = false;
    bWasEvent = ProcessSystemEvents( bAllCurrent, &bWasSchedulerEvent ) || bWasEvent;
    if ( !bWasSchedulerEvent && IsSchedulerEvent() )
    {
        ProcessSchedulerEvent();
        bWasEvent = true;
    }
    if ( !bWasEvent && bWait )
    {
        WaitForSystemEvents();
        bWasEvent = true;
    }
    return bWasEvent;
}
```

### General: main thread deferral

In almost all VCL backends, we run main thread deferrals by disabling the
`SolarMutex` using a boolean. In the case of the redirect, this makes
`tryToAcquire` and `doAcquire` return `true` or `1`, while a release is ignored.
Also the `IsCurrentThread()` mutex check function will act accordingly, so all
the `DBG_TESTSOLARMUTEX` calls won't fail.

Since we just disable the locks when we start running the deferred code in the
main thread, we won't let the main thread run into stuff where it would
normally wait for the SolarMutex.

Eventually this will move into the `SolarMutex`. KDE / Qt also does main
thread redirects using `Qt::BlockingQueuedConnection`.

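The boolean "disable" trick can be modeled with a tiny sketch (hypothetical class and members — the real logic lives in the backends' SalYieldMutex subclasses): while disabled, every acquire succeeds immediately and releases are ignored.

```cpp
// Sketch of the "disabled mutex" redirect (illustrative, not the real
// SalYieldMutex): while m_bDisabled is set, every acquire succeeds
// immediately and releases are ignored, so deferred main-thread code can
// run through SolarMutex-guarded paths without deadlocking.
class FakeSolarMutexSketch
{
    bool m_bDisabled = false;
    int m_nLockCount = 0;

public:
    void SetDisabled(bool bDisabled) { m_bDisabled = bDisabled; }

    bool tryToAcquire()
    {
        if (m_bDisabled)
            return true; // pretend we own it
        ++m_nLockCount;
        return true; // single-threaded sketch: always succeeds
    }

    void release()
    {
        if (m_bDisabled)
            return; // ignored while disabled
        --m_nLockCount;
    }

    int lockCount() const { return m_nLockCount; }
};
```
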
### General: processing all current events for `DoYield`

This is easily implemented for all non-priority queue based implementations.
Windows and macOS both have a timestamp attached to their events / messages,
so simply get the current time and just process anything < timestamp.
For the **KDE** backend this is already the default behaviour - single event
processing isn't even supported. The headless backend accomplishes this by
just processing a copy of the list of current events.

Problematic in this regard is the **Gtk+** backend. `g_main_context_iteration`
dispatches "only those highest priority event sources". There is no real way
to tell when these became ready. I've added a workaround idea to the TODO
list. FWIW: **Qt** runs just a single timer source in the glib main context,
basically the same we're doing with the LO scheduler as a system event.

The gen **X11** backend has some levels of redirection, but needs quite some work
to get this fixed.
### General: non-main thread yield

Yielding from a non-main thread must not wait in the main thread, as this
may block the main thread until some events happen.

Currently we wait on an extra conditional, which is cleared by the main event
loop.
### General: invalidation of elapsed timer event messages

Since the system timer to run the scheduler is single-shot, there should never
be more than one elapsed timer event in the system event queue. When stopping
or restarting the timer, we eventually have to remove the now invalid event
from the queue.

But for the Windows and macOS backends this may fail, as they have delayed
posting of events, so a remove directly following a post will actually not
remove anything. On Windows we even get unwanted processing of events outside
of the main event loop, which may call the Scheduler, as timer management is
handled in critical scheduler code.

To prevent these problems, we don't even try to remove these events, but
invalidate them by versioning the timer events. Timer events with invalid
versions are processed but simply don't run the scheduler.
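
The versioning trick can be sketched as follows (hypothetical names; the real logic lives in the backend SalTimer implementations): each posted timer event captures the version current at post time, and stopping or restarting the timer bumps the version so stale events become no-ops.

```cpp
// Sketch of timer event invalidation by versioning (illustrative only).
// Posted events carry the version current at post time; Stop()/Start()
// bump the version, so an already-posted event no longer runs the scheduler.
class VersionedTimerSketch
{
    unsigned m_nVersion = 0;

public:
    // Posting an event captures the current version.
    unsigned PostTimerEvent() const { return m_nVersion; }

    // Stopping or restarting invalidates all previously posted events.
    void Stop() { ++m_nVersion; }
    void Start() { ++m_nVersion; }

    // The event is still processed, but only a matching version runs the
    // scheduler - no need to remove it from the system queue.
    bool ShouldRunScheduler(unsigned nEventVersion) const
    {
        return nEventVersion == m_nVersion;
    }
};
```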

### General: track time of long running tasks

There is a `TaskStopwatch` class. It'll track the time and report a timeout
either when the task's time slice is finished or some system event occurred.

Eventually it will be merged into the main scheduler, so each invoked task can
easily track its runtime, and eventually this can be used to "blame" / find
other long running tasks, so interactivity can be improved.
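
A minimal model of the time-slice half of such a stopwatch (illustrative only — the real `TaskStopwatch` additionally checks for pending system events):

```cpp
#include <chrono>

// Minimal sketch of a time-slice keeper in the spirit of TaskStopwatch
// (illustrative; the real class also reports a timeout on system events).
class StopwatchSketch
{
    std::chrono::steady_clock::time_point m_aStart;
    std::chrono::milliseconds m_aTimeSlice;

public:
    explicit StopwatchSketch(std::chrono::milliseconds aSlice)
        : m_aStart(std::chrono::steady_clock::now())
        , m_aTimeSlice(aSlice)
    {
    }

    // A long running task calls this between work blocks and yields back to
    // the scheduler as soon as it returns false.
    bool ContinueWork() const
    {
        return std::chrono::steady_clock::now() - m_aStart < m_aTimeSlice;
    }
};
```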

There were some questions coming up when implementing it:

#### Why does the scheduler not detect that we only have idle tasks pending, and skip the instant timeout?

You never know how long a task will run. Currently the scheduler simply asks
each task when it'll be ready to run, until two runnable tasks are found.
Normally this is very quick, as LO has a lot of one-shot instant tasks / Idles
and just a very few long term pending Timers.

Especially UNO calls add a lot of Idles to the task list, which just need to
be processed in order.
| 241 | |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 242 | #### Why not use things like Linux timer wheels? |
| Jan-Marek Glogowski | 6e13585 | 2019-07-07 23:09:15 +0200 | [diff] [blame] | 243 | |
| 244 | LO has relatively few timers and a lot one-shot Idles. 99% of time the search |
| 245 | for the next task is quick, because there are just ~5 long term timers per |
| 246 | document (cache invalidation, cursor blinking etc.). |
| 247 | |
| 248 | This might become a problem, if you have a lot of open documents, so the long |
| 249 | term timer list increases AKA for highly loaded LOOL instances. |
| 250 | |
| 251 | But the Linux timer wheel mainly relies on the facts that the OS timers are |
| 252 | expected to not expire, as they are use to catch "error" timeouts, which rarely |
| Andrea Gelmini | 08097a2 | 2019-08-13 17:05:37 +0200 | [diff] [blame] | 253 | happen, so this definitely not matches LO's usage. |
| Jan-Marek Glogowski | 6e13585 | 2019-07-07 23:09:15 +0200 | [diff] [blame] | 254 | |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 255 | #### Not really usable to find misbehaving tasks |
| Jan-Marek Glogowski | 6e13585 | 2019-07-07 23:09:15 +0200 | [diff] [blame] | 256 | |
| 257 | The TaskStopwatch class is just a little time keeper + detecting of input |
| 258 | events. This is not about misbehaving Tasks, but long running tasks, which |
| 259 | have to yield to the Scheduler, so other Tasks and System events can be |
| 260 | processed. |
| 261 | |
| 262 | There is the TODO to merge the functionality into the Scheduler itself, at |
| 263 | which point we can think about profiling individual Tasks to improve |
| 264 | interactivity. |
| 265 | |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 266 | ### macOS implementation details |
| Jan-Marek Glogowski | fa1be53 | 2017-08-04 13:46:44 +0200 | [diff] [blame] | 267 | |
| Jan-Marek Glogowski | 33b094a | 2017-08-15 08:23:31 +0200 | [diff] [blame] | 268 | Generally the Scheduler is handled as expected, except on resize, which is |
| Jan-Marek Glogowski | da5cdcd | 2017-09-29 21:02:17 +0200 | [diff] [blame] | 269 | handled with different runloop-modes in macOS. In case of a resize, the normal |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 270 | `runloop` is suspended in `sendEvent`, so we can't call the scheduler via posted |
| Jan-Marek Glogowski | 9796f79 | 2017-08-08 15:03:37 +0200 | [diff] [blame] | 271 | main loop-events. Instead the scheduler uses the timer again. |
| 272 | |
| 273 | Like the Windows backend, all Cocoa / GUI handling also has to be run in |
| 274 | the main thread. We're emulating Windows out-of-order PeekMessage processing, |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 275 | via a `YieldWakeupEvent` and two conditionals. When in a `RUNINMAIN` call, all |
| 276 | the `DBG_TESTSOLARMUTEX` calls are disabled, as we can't release the |
| 277 | `SolarMutex`, but we can prevent running any other `SolarMutex` based code. |
| 278 | Those wakeup events must be ignored to prevent busy-locks. For more info read |
| 279 | the "General: main thread deferral" section. |
| Jan-Marek Glogowski | 9796f79 | 2017-08-08 15:03:37 +0200 | [diff] [blame] | 280 | |
| Hossein | 67615c8 | 2022-07-30 20:30:19 +0200 | [diff] [blame] | 281 | We can neither rely on macOS `dispatch_sync` code block execution nor the |
| Andrea Gelmini | ddcdd4a | 2017-09-22 11:23:45 +0200 | [diff] [blame] | 282 | message handling, as both can't be prioritized or filtered and the first |
| Jan-Marek Glogowski | 9796f79 | 2017-08-08 15:03:37 +0200 | [diff] [blame] | 283 | does also not allow nested execution and is just processed in sequence. |
| Jan-Marek Glogowski | 33b094a | 2017-08-15 08:23:31 +0200 | [diff] [blame] | 284 | |
| 285 | There is also a workaround for a problem for pushing tasks to an empty queue, |
| Hossein | 67615c8 | 2022-07-30 20:30:19 +0200 | [diff] [blame] | 286 | as `[NSApp postEvent: ... atStart: NO]` doesn't append the event, if the |
| Jan-Marek Glogowski | 33b094a | 2017-08-15 08:23:31 +0200 | [diff] [blame] | 287 | message queue is empty. |
| Jan-Marek Glogowski | fa1be53 | 2017-08-04 13:46:44 +0200 | [diff] [blame] | 288 | |
| Jan-Marek Glogowski | c69c1ee | 2017-10-14 01:43:07 +0200 | [diff] [blame] | 289 | An additional problem is the filtering of events on Window close. This drops |
| 290 | posted timer events, when a Window is closed resulting in a busy DoYield loop, |
| 291 | so we have to re-post the event, after closing a window. |

### Windows implementation details

Posted or sent event messages often trigger processing of WndProc in
`PeekMessage`, `GetMessage` or `DispatchMessage`, independently from the message
to fetch, remove or dispatch ("During this call, the system delivers pending,
nonqueued messages..."). Additionally messages have an inherited priority
based on the function used to generate them. Even if `WM_TIMER` messages should
have the lowest priority, a manually posted `WM_TIMER` is processed with the
priority of a PostMessage message.

So we're giving up on processing all our Scheduler events as a message in the
system message loop. Instead we just indicate a 0ms timer message by setting
`m_bDirectTimeout` in the timer object. This timer is always processed, if
the system message wasn't already our timer. As a result we can also skip the
polling. All this is one more reason to drop the single message processing
in favour of always processing all pending (system) events.

There is another special case we have to handle: window updates during move
and resize of windows. These system actions run in their own nested message
loop, so we have to completely switch to timers, even for 0 ms. But these
posted events prevent any event processing while we're busy. The only viable
solution seems to be to switch to `WM_TIMER` based timers, as these generate
messages with the lowest system priority (but they don't allow 0 ms timeouts).
So processing slows down during resize and move, but we gain working painting,
even when busy.

An additional workaround is implemented for the delayed queuing of posted
messages, where `PeekMessage` in `WinSalTimer::Stop()` won't be able to remove
the just posted timer callback message. See "General: invalidation of elapsed
timer event messages" for the details.

To run the required GUI code in the main thread without unlocking the
`SolarMutex`, we "disable" it. For more info read the "General: main thread
deferral" section.

### KDE implementation details

This implementation also works as intended. But there is a different Yield
handling, because Qt's `QAbstractEventDispatcher::processEvents` will always
process all pending events.


## TODOs and ideas

### Task dependencies AKA children

Every task can have a list of children / a child.

* When a task is stopped, the children are started.
* When a task is started, the children are stopped.

This should be easy to implement.

### Per priority time-sorted queues

This would result in an O(1) scheduler. It was used in the Linux kernel for
some time (search for Ingo Molnar's O(1) scheduler). This can be a scheduling
optimization, which would prevent walking a longer event list. But probably the
management overhead would be too large, as we have many one-shot events.

To find the next task the scheduler just walks the (constant) list of priority
queues and schedules the first ready event of any queue.

The downside of this approach: Insert / Start / Reschedule (for "auto" tasks)
now need O(log(n)) to find the position in the queue of the priority.
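
The idea can be sketched with one min-heap per priority level (illustrative only; deadlines are plain integers here, and the type names are hypothetical): finding the next task is a constant-size walk over the priority array, while insertion pays the O(log n) heap cost mentioned above.

```cpp
#include <array>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Sketch of "per priority time-sorted queues" (illustrative only).
// One min-heap per priority level, ordered by deadline; the number of
// priority levels is a small constant (see B.6).
struct PriorityQueuesSketch
{
    static constexpr int nPriorities = 3;
    using Entry = std::pair<long, int>; // (deadline, task id)
    std::array<std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>>,
               nPriorities>
        aQueues;

    void Insert(int nPriority, long nDeadline, int nTaskId)
    {
        aQueues[nPriority].push({nDeadline, nTaskId}); // O(log n) heap insert
    }

    // Walk the constant list of priorities; return the first task whose
    // deadline has elapsed, or -1 if nothing is ready. Higher priority
    // (lower index) always wins, as in the real scheduler.
    int PopNextReady(long nNow)
    {
        for (auto& rQueue : aQueues)
        {
            if (!rQueue.empty() && rQueue.top().first <= nNow)
            {
                int nTaskId = rQueue.top().second;
                rQueue.pop();
                return nTaskId;
            }
        }
        return -1;
    }
};
```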

### Always process all (higher priority) pending events

Currently `Application::Reschedule()` processes a single event or "all" events,
with "all" defined as "100 events" in most backends. This is already ignored
by the KDE backend, as Qt defines its `QAbstractEventDispatcher::processEvents`
as processing all pending events (there are ways to skip event classes, but no
easy way to process just a single event).

Since the Scheduler is always handled by the system message queue, there is
really no more reason to stop after 100 events to prevent LO Scheduler
starvation.
| Jan-Marek Glogowski | 9796f79 | 2017-08-08 15:03:37 +0200 | [diff] [blame] | 370 | |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 371 | ### Drop static inherited or composed Task objects |
| Jan-Marek Glogowski | 06ce312 | 2017-08-23 16:07:50 +0200 | [diff] [blame] | 372 | |
| 373 | The sequence of destruction of static objects is not defined. So a static Task |
| 374 | can not be guaranteed to happen before the Scheduler. When dynamic unloading |
| 375 | is involved, this becomes an even worse problem. This way we could drop the |
| 376 | mbStatic workaround from the Task class. |
| 377 | |
| Hossein | dc984e7 | 2021-10-20 10:55:25 +0200 | [diff] [blame] | 378 | ### Run the LO application in its own thread |
| Jan-Marek Glogowski | 9796f79 | 2017-08-08 15:03:37 +0200 | [diff] [blame] | 379 | |
| Jan-Marek Glogowski | da5cdcd | 2017-09-29 21:02:17 +0200 | [diff] [blame] | 380 | This would probably get rid of most of the macOS and Windows implementation |
| Jan-Marek Glogowski | 9796f79 | 2017-08-08 15:03:37 +0200 | [diff] [blame] | 381 | details / workarounds, but is quite probably a large amount of work. |
| 382 | |
| 383 | Instead of LO running in the main process / thread, we run it in a 2nd thread |
| 384 | and defer al GUI calls to the main thread. This way it'll hopefully not block |
| 385 | and can process system events. |
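
The deferral idea could be sketched with a hypothetical dispatcher (none of
these names exist in VCL; this is just one possible shape): the second "LO"
thread posts GUI work, and the main thread drains and executes it from its
native event loop.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical sketch: GUI calls are queued by the LO thread and only
// ever executed on the main thread.
class MainThreadDispatcher
{
    std::mutex maMutex;
    std::queue<std::function<void()>> maQueue;

public:
    // Called from the LO (second) thread.
    void post(std::function<void()> fn)
    {
        std::lock_guard<std::mutex> aGuard(maMutex);
        maQueue.push(std::move(fn));
    }

    // Called from the main thread's native event loop.
    void drain()
    {
        for (;;)
        {
            std::function<void()> fn;
            {
                std::lock_guard<std::mutex> aGuard(maMutex);
                if (maQueue.empty())
                    return;
                fn = std::move(maQueue.front());
                maQueue.pop();
            }
            fn(); // the GUI call runs on the main thread
        }
    }
};
```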

That's just a theory - it definitely needs more analysis before even
attempting an implementation.

### Re-evaluate the macOS `ImplNSAppPostEvent`

A solution comparable to the Windows backend's delayed `PostMessage`
workaround using a validation timestamp is probably better than the current
peek, remove, re-postEvent sequence, which has to run in the main thread.

Originally I didn't evaluate whether the event is actually lost or just
delayed.
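
A minimal sketch of such a validation-timestamp scheme (all names invented
for illustration; the real Windows workaround differs in detail): the poster
stamps each wakeup, the consumer records what it has seen, and a watchdog
re-posts only if a stamped wakeup was apparently never consumed.

```cpp
#include <atomic>
#include <cstdint>

// Illustrative sketch: detect a lost (or long-delayed) wakeup event by
// comparing post and consume timestamps, instead of peeking / removing /
// re-posting events in the main thread.
struct WakeupValidator
{
    std::atomic<std::uint64_t> mnPosted{0};
    std::atomic<std::uint64_t> mnConsumed{0};

    void onPost(std::uint64_t nNow) { mnPosted.store(nNow); }
    void onConsume() { mnConsumed.store(mnPosted.load()); }

    // True, if the last posted wakeup was never seen by the consumer.
    bool needsRepost() const { return mnConsumed.load() < mnPosted.load(); }
};
```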

### Drop `nMaxEvents` from Gtk+ based backends

```
gint last_priority = G_MAXINT;
bool bWasEvent = false;
bool bHasPending = false;
do
{
    gint max_priority;
    g_main_context_acquire( NULL );
    bHasPending = g_main_context_prepare( NULL, &max_priority );
    g_main_context_release( NULL );
    if ( bHasPending )
    {
        // Lower value means higher priority in GLib.
        if ( last_priority > max_priority )
        {
            bHasPending = g_main_context_iteration( NULL, bWait );
            bWasEvent = bWasEvent || bHasPending;
            // Remember the priority of the batch just processed.
            last_priority = max_priority;
        }
        else
            bHasPending = false;
    }
}
while ( bHasPending );
```

The idea is to use `g_main_context_prepare` and keep the `max_priority` as an
indicator. We cannot prevent running newly arriving lower-priority events, but
we can prevent running newly arriving higher-priority events, which should be
sufficient for most cases.

This also touches user event processing, which currently runs as a
high-priority idle in the event loop.

### Drop `nMaxEvents` from the gen (X11) backend

A few layers of indirection make this code hard to follow. The
`SalXLib::Yield` and `SalX11Display::Yield` architecture makes it impossible
to process just the current events. This really needs a refactoring and
re-architecture step, which will also affect the Gtk+ and KDE backends for
the user event handling.

### Merge TaskStopwatch functionality into the Scheduler

This way it can more easily be used to profile Tasks, and eventually to
improve LO's interactivity.
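
A rough sketch of what scheduler-integrated task timing could look like,
using `std::chrono` (illustrative names, not the real TaskStopwatch API):
time each invocation and accumulate per-task cost for later profiling.

```cpp
#include <chrono>
#include <functional>

// Illustrative sketch: the scheduler wraps each task invocation in a
// stopwatch, accumulating total runtime and an invocation count.
struct TimedTaskSketch
{
    std::chrono::nanoseconds maTotal{0};
    unsigned mnRuns = 0;

    void invoke(const std::function<void()>& rTask)
    {
        const auto t0 = std::chrono::steady_clock::now();
        rTask(); // run the scheduled task
        maTotal += std::chrono::steady_clock::now() - t0;
        ++mnRuns;
    }
};
```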

## See Also

- [Solar Mutex](https://wiki.openoffice.org/wiki/Terms/Solar_Mutex)
- [LibreOffice Main Loop](https://wiki.documentfoundation.org/Development/LHM_LiMux/Main_Loop)
- [AOO Advanced Threading-Architecture (proposal)](https://wiki.openoffice.org/wiki/Architecture/Proposal/Advanced_Threading-Architecture)
- [Revise OOo Multi-Threading Efforts](https://wiki.openoffice.org/wiki/Effort/Revise_OOo_Multi-Threading)
- [Multi-Threading Analysis](https://wiki.openoffice.org/wiki/Analysis/Multi-Threading)
- [AOO Wiki - Category:Multi-Threading](https://wiki.openoffice.org/wiki/Category:Multi-Threading)