Description
We are experiencing multiple issues in the area of "stuck" notifications and unread markers. Many are related to the way receipts interact with threaded messages.
Symptoms
We are seeing different symptoms of the problem:
- "Stuck" unread dots and notification counters: Even when I have read all messages in a room, the room is still marked as unread or having notifications.
- Unread dots or notification counters returning on app startup: When I restart the app (or refresh the page on Web) rooms which I have already read re-appear as unread.
- Notification counters growing and shrinking spontaneously after entering or scrolling in a room.
Spec-level causes
Message ordering
Fundamentally, in order to interpret the meaning of a receipt that says "I have read everything up to here", we need to know what order messages are in. This is not clear in the spec, and we propose to make it clear and explicit in MSC4033.
In the meantime, Element Web uses a combination of "sync" order (the implicit order of events arriving via a /sync request) and "timestamp" order (using the ts property within events).
Some of the existing bugs are probably caused by this inconsistency, but it is not yet clear how many. We believe that bugs in the implementation cause additional problems, and that this theoretical inconsistency accounts for only a few of them.
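To illustrate how the two orderings can disagree, here is a minimal sketch. It is not Element Web's actual code: the `TimelineEvent` shape, the `syncIndex` field and the `unreadAfterReceipt` helper are hypothetical names introduced only for this example.

```typescript
// Hypothetical minimal event shape: `ts` mirrors the timestamp carried by
// the event, `syncIndex` is the position at which it arrived via /sync.
interface TimelineEvent {
    id: string;
    ts: number;
    syncIndex: number;
}

type Ordering = "sync" | "timestamp";

// Returns the ids of the events that a receipt pointing at `receiptEventId`
// leaves unread under the chosen ordering.
function unreadAfterReceipt(
    events: TimelineEvent[],
    receiptEventId: string,
    ordering: Ordering,
): string[] {
    const key = (e: TimelineEvent): number => (ordering === "sync" ? e.syncIndex : e.ts);
    const target = events.find((e) => e.id === receiptEventId);
    // Unknown receipt target: nothing can be assumed to be read.
    if (!target) return events.map((e) => e.id);
    return events.filter((e) => key(e) > key(target)).map((e) => e.id);
}

// $c arrives last over /sync but carries an earlier timestamp than $b.
const sampleEvents: TimelineEvent[] = [
    { id: "$a", ts: 1000, syncIndex: 0 },
    { id: "$b", ts: 3000, syncIndex: 1 },
    { id: "$c", ts: 2000, syncIndex: 2 },
];

const unreadBySync = unreadAfterReceipt(sampleEvents, "$b", "sync");
const unreadByTs = unreadAfterReceipt(sampleEvents, "$b", "timestamp");
```

With a receipt on `$b`, sync order leaves `$c` unread while timestamp order treats everything as read, so the same receipt yields two different unread states.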
Which thread the root belongs to
The spec has what we consider a bug when it talks about which thread the root message belongs to, which has been reflected in client code, making it inconsistent with the server implementation (at least on the Synapse server). We have a proposal to fix this bug in MSC4037.
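As a rough sketch of the rule proposed in MSC4037 ("thread root is not in the thread"), the following shows how a client might decide which thread a receipt for an event is filed under. The `MatrixEventLike` type and the `receiptThreadId` function are hypothetical and model only the `m.thread` relation; other relation types are deliberately left out.

```typescript
// Minimal hypothetical event shape; only the relation field is modelled.
interface MatrixEventLike {
    event_id: string;
    content: {
        "m.relates_to"?: { rel_type: string; event_id: string };
    };
}

const MAIN_TIMELINE = "main";

// Thread that a receipt for `event` would be filed under, assuming the
// MSC4037 rule: the thread root belongs to the main timeline.
function receiptThreadId(event: MatrixEventLike): string {
    const relation = event.content["m.relates_to"];
    if (relation?.rel_type === "m.thread") {
        // Thread replies belong to the thread identified by its root event.
        return relation.event_id;
    }
    // Everything else in this sketch, including thread roots, is "main".
    return MAIN_TIMELINE;
}

const rootEvent: MatrixEventLike = { event_id: "$root", content: {} };
const replyEvent: MatrixEventLike = {
    event_id: "$reply",
    content: { "m.relates_to": { rel_type: "m.thread", event_id: "$root" } },
};
```

Here `receiptThreadId(rootEvent)` yields `"main"` and `receiptThreadId(replyEvent)` yields `"$root"`. A client that instead placed the root inside its own thread would send or interpret receipts differently from the server.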
Identifying which thread any message is in
It is sometimes difficult for clients to identify which thread an event belongs to, meaning that a receipt pointing to it is sometimes ignored. We have begun drafting MSC4023 to address this.
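The difficulty can be sketched as follows, assuming a second-order relation such as a reaction to a thread reply. The `RelatedEvent` shape (with a flattened `relates_to` field) and the `resolveThreadId` helper are hypothetical and are not the matrix-js-sdk API.

```typescript
// Hypothetical simplified event: the relation is flattened to `relates_to`.
interface RelatedEvent {
    event_id: string;
    relates_to?: { rel_type: string; event_id: string };
}

// Walks relations through locally known events. Returns undefined when a
// parent is missing, in which case a receipt cannot be attributed to a thread.
function resolveThreadId(
    event: RelatedEvent,
    known: Map<string, RelatedEvent>,
): string | undefined {
    let current: RelatedEvent | undefined = event;
    const seen = new Set<string>(); // guards against relation cycles
    while (current && !seen.has(current.event_id)) {
        seen.add(current.event_id);
        const rel = current.relates_to;
        if (!rel) return "main";
        if (rel.rel_type === "m.thread") return rel.event_id;
        current = known.get(rel.event_id);
    }
    return undefined;
}

const threadReply: RelatedEvent = {
    event_id: "$reply",
    relates_to: { rel_type: "m.thread", event_id: "$root" },
};
const reaction: RelatedEvent = {
    event_id: "$reaction",
    relates_to: { rel_type: "m.annotation", event_id: "$reply" },
};
```

If `$reply` is known locally, the reaction resolves to thread `$root`. If it is not, the function returns `undefined`, and a receipt pointing at the reaction has no thread to be applied to.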
Other
Previously, we believed that MSC3981 (recursive relations) would solve some of the problems, but since that MSC does not solve the event-ordering problem (because the events from the /relations API are returned in "topological" order) we no longer believe it is important, except as a performance optimisation.
Code-level causes
We have found and fixed several bugs in the Element Web code that were caused by an incomplete understanding of the meaning of threaded and unthreaded read receipts. We anticipate that some more exist.
(We believe that the primary reason why we're not seeing the same problems on mobile is that the apps persist events they've received, whereas Element Web has to re-fetch from scratch after every launch. As a result, any issue in the unread state logic strikes again and again. The apps also use a single timeline, whereas Element Web maintains one timeline per thread in addition to the main timeline in every room.)
High-level plan of actions
- Set up POC branch / deployment with threads disabled – Experiment with putting threads back behind a beta toggle #25676
- Fix read-receipt behavior around thread roots – Fix read receipt sending behaviour around thread roots matrix-org/matrix-js-sdk#3600
- Fix read-receipt behavior around non-thread relations to thread roots – Fix edge cases around non-thread relations to thread roots and read receipts matrix-org/matrix-js-sdk#3607
- Fix missing message issues due to replies to unknown events – Fix how Room::eventShouldLiveIn handles replies to unknown parents matrix-org/matrix-js-sdk#3615
- Fix unread count returning from zero after reload – Message in a thread is read and becomes unread when I refresh #25806
- Fix notification counters sometimes being doubled – Notification count is doubled in e2ee rooms on startup #25803
- Set up unread & notification test suite – Test cases for stuck unreads and notifs #25449
- Confirm and fix the bug where an unthreaded receipt is not returned in the next syncs – Unthreaded receipt is not returned in the next syncs synapse#17247
- Pre-investigate the issues listed after the threads model refactoring to see if any of them are fixable beforehand
- Refactor threads model code and remove serverless / legacy code path to simplify further changes
- Fix missing notifications – Thread replies sent from Element Android cause neither unread count nor unread dot on Element Desktop #25621
- Fix incorrect thread reply count – Reply count in thread is wrong #24636
- Fix notification counters not adding up between spaces and rooms – Notification badges in left panel contradict each other #20372
- Fix unread counter explosion – Unread count goes from zero through the roof after restarting the app #25479
- Fix Zombie notifications from old threads – ? issue ?
- Fix read receipts / fully read marker jumping backwards – ? issue ?
- Triage existing issues to increase confidence that the root causes will actually be fixed through the MSCs mentioned above.
- Fix remaining bugs found in no-threads POC as they likely impact the threaded experience as well – Experiment with putting threads back behind a beta toggle #25676
We believe a lot of progress can still be made without spec changes. So we're slightly deprioritising work on the MSCs.
- MSC4037: Thread root is not in the thread matrix-org/matrix-spec-proposals#4037
We claim it is implemented in EW
Synapse will need small fixes when this is merged
- MSC4033: Explicit ordering of events for receipts matrix-org/matrix-spec-proposals#4033
Needs implementation on server and client but we believe it should mostly only cover edge cases
- MSC4023: Thread ID for 2nd order-relation matrix-org/matrix-spec-proposals#4023
There is disagreement about the way to specify this and possible interference with MSC3051: A scalable relation format matrix-org/matrix-spec-proposals#3051
- MSC3389: Relation redactions matrix-org/matrix-spec-proposals#3389
There is likely no good workaround other than implementing this MSC
New issue inbox
The following is a holding area for newly reported issues that require review. Once reviewed, issues should either be moved to one of the other task lists below or, if not applicable, removed from this epic.
- Missing notifications badges #25331
- Stuck notifications due to unread threads not updating rapidly. #25420
- Room stuck as unread despite all messages/threads marked as read (to my knowledge) #25481
- Stuck notification after redacting thread root #25480
- Unread count goes from zero through the roof after restarting the app #25479
- https://github.com/matrix-org/element-web-rageshakes/issues/21717
- https://github.com/matrix-org/element-web-rageshakes/issues/21724
- Stuck notification case #25513
- https://github.com/matrix-org/element-web-rageshakes/issues/21737
- Stuck notifications due to edits in threads #25482
- Constant unread highlight notifications in one particular room #25528
- Starting nightly for the first time seems to trigger notification counter mess in normal Desktop #25541
- Thread replies sent from Element Android cause neither unread count nor unread dot on Element Desktop #25621
- Room with thread message not shown at top of list when „Show rooms with unread messages first“ is checked #24547
- Stuck unread when latest message is in a thread #25623
- Errors in console likely linked to stuck unreads #25642
- Persistent Threads dot when last message in a thread is redacted #25670
- Experiment with putting threads back behind a beta toggle #25676
- Permanent stuck unread-notification when starting a thread with a reply message #23976
- Read message is marked as unread #25804
- Message in a thread is read and becomes unread when I refresh #25806
- Read status of room resetting itself and not shared to other devices #24111
- impossible to dismiss notification #25408
- New threads/new thread messages always show "new messages" at already 20x times read other threads too #25907
- Audible ping but no badge on notifications for bot messages #25904
- Stuck unread dot on v1.11.38 #25929
- https://github.com/matrix-org/element-web-rageshakes/issues/22226
- https://github.com/matrix-org/element-web-rageshakes/issues/22363
- https://github.com/matrix-org/element-web-rageshakes/issues/22414
- Stuck notifications in a room with no posting permissions #25975
- Persistent stuck unread in encrypted room #25984
- New message count not clearing in DM #25950
- Stuck notification in DM #26063
- https://github.com/matrix-org/element-web-rageshakes/issues/22526
- Threads might skip processing some replies from the sync response matrix-org/matrix-js-sdk#3665
- Notification for unread redacted message in thread gets stuck #26933
- no way to mark notification as done for a chat which no longer exists. #30345
Tasks not blocked by spec work
- Stuck notifications: Create devtools room stuck notifications debug tool #24388
- Stuck notifications: Reacting to a not-latest message in a thread causes a stuck notification #24000
- Stuck notifications: DM stuck as unread #23991
- Stuck notifications: New DMs have stuck notifications #23685
- https://github.com/matrix-org/element-web-rageshakes/issues/21667
- Read messages reappear as "new" after restarting the app. #24629
- „Mark as read“ button does not send the correct receipt for the last event in a thread #25207
- Unthreaded read receipts are sent after reading the main timeline #25196
- Test how we handle receiving an unthreaded read receipt #25212
- Rooms incorrectly marked as unread #10954
- https://github.com/matrix-org/element-web-rageshakes/issues/21641
- Cannot mark room as read. #25411
- New reaction to old event ends up on the main timeline instead of the thread #25450
- Test cases for stuck unreads and notifs #25449
- Experiment with putting threads back behind a beta toggle #25676
- Missing reactions for some messages #25596
Tasks that are related to or dependent on spec work
We've written the following MSCs to try and address the root causes in a reliable and performant way:
- MSC4033: Explicit ordering of events for receipts matrix-org/matrix-spec-proposals#4033
A client-side workaround for this based on timestamp ordering has been implemented. This is imperfect and easily abused though.
- MSC4023: Thread ID for 2nd order-relation matrix-org/matrix-spec-proposals#4023
A client-side workaround for this based on calling /event to fetch the parent has been implemented. This is functionally correct but has a noticeable performance impact.
- MSC4037: Thread root is not in the thread matrix-org/matrix-spec-proposals#4037
The client-side code has already been modeled to reflect the behavior proposed in the MSC (which is also what Synapse already does today).
- MSC3981: /relations recursion matrix-org/matrix-spec-proposals#3981
We expect this to not help with the ordering problems but it will be a performance improvement.
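The parent-fetching workaround mentioned for MSC4023 could look roughly like the sketch below. This is an assumption-laden illustration, not the implemented code: `FetchEvent` stands in for whatever performs the /event request, and `cachedFetcher` and `threadIdViaParent` are hypothetical helpers. The cache is one possible way to limit the performance cost of repeated lookups.

```typescript
interface FetchedEvent {
    event_id: string;
    relates_to?: { rel_type: string; event_id: string };
}

// Hypothetical stand-in for a function that requests a single event
// from the server.
type FetchEvent = (eventId: string) => Promise<FetchedEvent>;

// Wraps a fetcher so that each parent event is requested at most once.
// Note: in this sketch a failed request stays cached as a rejected promise.
function cachedFetcher(fetchEvent: FetchEvent): FetchEvent {
    const cache = new Map<string, Promise<FetchedEvent>>();
    return (eventId) => {
        let pending = cache.get(eventId);
        if (!pending) {
            pending = fetchEvent(eventId);
            cache.set(eventId, pending);
        }
        return pending;
    };
}

// Resolves the thread of an event by looking one hop up to its parent.
async function threadIdViaParent(
    event: FetchedEvent,
    fetchEvent: FetchEvent,
): Promise<string> {
    const rel = event.relates_to;
    if (!rel) return "main";
    if (rel.rel_type === "m.thread") return rel.event_id;
    // Second-order relation (e.g. a reaction): inspect the parent event.
    const parent = await fetchEvent(rel.event_id);
    const parentRel = parent.relates_to;
    return parentRel?.rel_type === "m.thread" ? parentRel.event_id : "main";
}
```

Every event with a second-order relation whose parent is not yet cached costs one extra round trip, which is consistent with the noticeable performance impact described above.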
- Write / shepherd MSC3981 element-meta#1350
- Implement MSC3981 in Synapse matrix-org/synapse#15377
- Implement MSC3981 in matrix-react-sdk #25021
- Room stuck as unread after mention in a thread #24312
- Stuck unread with reaction + start thread #24394
- Can't get rid of unread status of a room #24595
- Thread not being marked as read #24442
- impossible to dismiss notification #25408
- https://github.com/matrix-org/element-web-rageshakes/issues/21679
- Stuck notifications due to edits in threads #25482
- Write / shepherd MSC4023 element-meta#1714
- Implement MSC4023: Thread ID for 2nd order-relation in Synapse matrix-org/synapse#15701
- Excessive calls to GET /relations and GET /event, with MSC3981 enabled and working #25395
- Stuck notifications when there's no new message after a thread #25893
Issues that are related but out of scope
Time sheeting
WEB: Stuck notifications