Line data Source code
1 : //! This module defines `RequestContext`, a structure that we use throughout
2 : //! the pageserver to propagate high-level context from places
3 : //! that _originate_ activity down to the shared code paths at the
4 : //! heart of the pageserver. It's inspired by Golang's `context.Context`.
5 : //!
6 : //! For example, in `Timeline::get(page_nr, lsn)` we need to answer the following questions:
7 : //! 1. What high-level activity ([`TaskKind`]) needs this page?
8 : //! We need that information as a categorical dimension for page access
9 : //! statistics, which we, in turn, need to guide layer eviction policy design.
10 : //! 2. How should we behave if, to produce the page image, we need to
11 : //! download a layer file on demand ([`DownloadBehavior`])?
12 : //!
13 : //! [`RequestContext`] satisfies those needs.
14 : //! The current implementation is a small `struct` that is passed through
15 : //! the call chain by reference.
16 : //!
17 : //! ### Future Work
18 : //!
19 : //! However, we do not intend to stop here, since there are other needs that
20 : //! require carrying information from high to low levels of the app.
21 : //!
22 : //! Most importantly, **cancellation signaling** in response to
23 : //! 1. timeouts (page_service max response time) and
24 : //! 2. lifecycle requests (detach tenant, delete timeline).
25 : //!
26 : //! Related to that, there is sometimes a need to ensure that all tokio tasks spawned
27 : //! by the transitive callees of a request have finished. The keyword here
28 : //! is **Structured Concurrency**, and right now, we use `task_mgr` in most places,
29 : //! `TaskHandle` in some places, and careful code review around `FuturesUnordered`
30 : //! or `JoinSet` in other places.
31 : //!
32 : //! We do not yet have a systematic cancellation story in pageserver, and it is
33 : //! pretty clear that [`RequestContext`] will be responsible for that.
34 : //! So, the API already prepares for this role through the
35 : //! [`RequestContext::detached_child`] and [`RequestContext::attached_child`] methods.
36 : //! See their doc comments for details on how we will use them in the future.
37 : //!
38 : //! It is not clear whether or how we will enforce Structured Concurrency, and
39 : //! what role [`RequestContext`] will play there.
40 : //! So, the API doesn't prepare us for this topic.
41 : //!
42 : //! Other future uses of `RequestContext`:
43 : //! - Communicate compute & IO priorities (user-initiated request vs. background-loop)
44 : //! - Request IDs for distributed tracing
45 : //! - Request/Timeline/Tenant-scoped log levels
46 : //!
47 : //! `RequestContext` might look quite different once it supports those features.
48 : //! Likely, it will have a shape similar to Golang's `context.Context`.
49 : //!
50 : //! ### Why A Struct Instead Of Method Parameters
51 : //!
52 : //! What's typical about such information is that it needs to be passed down
53 : //! along the call chain from high level to low level, but few of the functions
54 : //! in the middle need to understand it.
55 : //! Further, it is to be expected that we will need to propagate more data
56 : //! in the future (see the earlier section on future work).
57 : //! Hence, for functions in the middle of the call chain, we have the following
58 : //! requirements:
59 : //! 1. It should be easy to forward the context to callees.
60 : //! 2. To propagate more data from high-level to low-level code, the functions in
61 : //! the middle should not need to be modified.
62 : //! The solution is to have a container structure ([`RequestContext`]) that
63 : //! carries the information. Functions that don't care about what's in it
64 : //! pass it along to callees.
65 : //!
66 : //! ### Why Not Task-Local Variables
67 : //!
68 : //! One could use task-local variables (the equivalent of thread-local variables)
69 : //! to address the immediate needs outlined above.
70 : //! However, we reject task-local variables because:
71 : //! 1. they are implicit, thereby making it harder to trace the data flow in code
72 : //! reviews and during debugging,
73 : //! 2. they can be mutable, which enables implicit return data flow,
74 : //! 3. they are restrictive in that code which fans out into multiple tasks,
75 : //! or even threads, needs to carefully propagate the state.
76 : //!
77 : //! In contrast, information flow with [`RequestContext`] is
78 : //! 1. always explicit,
79 : //! 2. strictly uni-directional because `RequestContext` is immutable,
80 : //! 3. tangible because a [`RequestContext`] is just a value.
81 : //! When creating child activities, regardless of whether it's a task,
82 : //! thread, or even an RPC to another service, the value can
83 : //! be used like any other argument.
84 : //!
85 : //! The solution is that all code paths are infected with precisely one
86 : //! [`RequestContext`] argument. Functions in the middle of the call chain
87 : //! only need to pass it on.
88 :
89 : use crate::task_mgr::TaskKind;
90 :
91 : /// The main structure of this module; see the module-level comment.
92 2 : #[derive(Clone, Debug)]
93 : pub struct RequestContext {
94 : task_kind: TaskKind,
95 : download_behavior: DownloadBehavior,
96 : access_stats_behavior: AccessStatsBehavior,
97 : page_content_kind: PageContentKind,
98 : }
99 :
100 : /// The kind of access to the page cache.
101 379105494 : #[derive(Clone, Copy, PartialEq, Eq, Debug, enum_map::Enum, strum_macros::IntoStaticStr)]
102 : pub enum PageContentKind {
103 : Unknown,
104 : DeltaLayerBtreeNode,
105 : DeltaLayerValue,
106 : ImageLayerBtreeNode,
107 : ImageLayerValue,
108 : InMemoryLayer,
109 : }
110 :
111 : /// Desired behavior if the operation requires an on-demand download
112 : /// to proceed.
113 2 : #[derive(Clone, Copy, PartialEq, Eq, Debug)]
114 : pub enum DownloadBehavior {
115 : /// Download the layer file. It can take a while.
116 : Download,
117 :
118 : /// Download the layer file, but print a warning to the log. This should be used
119 : /// in code where the layer file is expected to already exist locally.
120 : Warn,
121 :
122 : /// Return a `PageReconstructError::NeedsDownload` error.
123 : Error,
124 : }
125 :
126 : /// Whether this request should update access times used in LRU eviction
127 23938763 : #[derive(Clone, Copy, PartialEq, Eq, Debug)]
128 : pub(crate) enum AccessStatsBehavior {
129 : /// Update access times: this request's access to data should be taken
130 : /// as a hint that the accessed layer is likely to be accessed again
131 : Update,
132 :
133 : /// Do not update access times: this request is accessing the layer
134 : /// but does not want to indicate that the layer should be retained in cache,
135 : /// perhaps because the requestor is a compaction routine that will soon cover
136 : /// this layer with another.
137 : Skip,
138 : }
139 :
140 : pub struct RequestContextBuilder {
141 : inner: RequestContext,
142 : }
143 :
144 : impl RequestContextBuilder {
145 : /// A new builder with default settings
146 44345 : pub fn new(task_kind: TaskKind) -> Self {
147 44345 : Self {
148 44345 : inner: RequestContext {
149 44345 : task_kind,
150 44345 : download_behavior: DownloadBehavior::Download,
151 44345 : access_stats_behavior: AccessStatsBehavior::Update,
152 44345 : page_content_kind: PageContentKind::Unknown,
153 44345 : },
154 44345 : }
155 44345 : }
156 :
157 120150400 : pub fn extend(original: &RequestContext) -> Self {
158 120150400 : Self {
159 120150400 : // This is like a `Copy`, but we avoid implementing `Copy` because ordinary users of
160 120150400 : // `RequestContext` should always move or borrow it.
161 120150400 : inner: RequestContext {
162 120150400 : task_kind: original.task_kind,
163 120150400 : download_behavior: original.download_behavior,
164 120150400 : access_stats_behavior: original.access_stats_behavior,
165 120150400 : page_content_kind: original.page_content_kind,
166 120150400 : },
167 120150400 : }
168 120150400 : }
169 :
170 : /// Configure the DownloadBehavior of the context: whether to
171 : /// download missing layers, and/or warn on the download.
172 44345 : pub fn download_behavior(mut self, b: DownloadBehavior) -> Self {
173 44345 : self.inner.download_behavior = b;
174 44345 : self
175 44345 : }
176 :
177 : /// Configure the AccessStatsBehavior of the context: whether layer
178 : /// accesses should update the access time of the layer.
179 1518 : pub(crate) fn access_stats_behavior(mut self, b: AccessStatsBehavior) -> Self {
180 1518 : self.inner.access_stats_behavior = b;
181 1518 : self
182 1518 : }
183 :
184 120148882 : pub(crate) fn page_content_kind(mut self, k: PageContentKind) -> Self {
185 120148882 : self.inner.page_content_kind = k;
186 120148882 : self
187 120148882 : }
188 :
189 120194745 : pub fn build(self) -> RequestContext {
190 120194745 : self.inner
191 120194745 : }
192 : }
193 :
194 : impl RequestContext {
195 : /// Create a new RequestContext that has no parent.
196 : ///
197 : /// The function is called `new` because, once we add children
198 : /// to it using `detached_child` or `attached_child`, the contexts
199 : /// will form a tree (not implemented yet, since cancellation will be
200 : /// the first feature that requires a tree).
201 : ///
202 : /// # Future: Cancellation
203 : ///
204 : /// A context like this one can be canceled only if someone
205 : /// explicitly cancels it.
206 : /// It has no parent, so it cannot inherit cancellation from there.
207 44345 : pub fn new(task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
208 44345 : RequestContextBuilder::new(task_kind)
209 44345 : .download_behavior(download_behavior)
210 44345 : .build()
211 44345 : }
212 :
213 : /// Create a detached child context for a task that may outlive `self`.
214 : ///
215 : /// Use this when spawning new background activity that should complete
216 : /// even if the current request is canceled.
217 : ///
218 : /// # Future: Cancellation
219 : ///
220 : /// Cancellation of `self` will not propagate to the child context returned
221 : /// by this method.
222 : ///
223 : /// # Future: Structured Concurrency
224 : ///
225 : /// We could add the Future as a parameter to this function, spawn it as a task,
226 : /// and pass to the new task the child context as an argument.
227 : /// That would be an ergonomic improvement.
228 : ///
229 : /// We could make new calls to this function fail if `self` is already canceled.
230 16184 : pub fn detached_child(&self, task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
231 16184 : self.child_impl(task_kind, download_behavior)
232 16184 : }
233 :
234 : /// Create a child of context `self` for a task that shall not outlive `self`.
235 : ///
236 : /// Use this when fanning out work to other async tasks.
237 : ///
238 : /// # Future: Cancellation
239 : ///
240 : /// Canceling a context will propagate to its attached children.
241 : ///
242 : /// # Future: Structured Concurrency
243 : ///
244 : /// We could add the Future as a parameter to this function, spawn it as a task,
245 : /// and track its `JoinHandle` inside the `RequestContext`.
246 : ///
247 : /// We could then provide another method to allow waiting for all child tasks
248 : /// to finish.
249 : ///
250 : /// We could make new calls to this function fail if `self` is already canceled.
251 : /// Alternatively, we could allow the creation but not spawn the task.
252 : /// The method to wait for child tasks would return an error, indicating
253 : /// that the child task was not started because the context was canceled.
254 18130 : pub fn attached_child(&self) -> Self {
255 18130 : self.child_impl(self.task_kind(), self.download_behavior())
256 18130 : }
257 :
258 : /// Use this function when you should be creating a child context using
259 : /// [`attached_child`] or [`detached_child`], but your caller doesn't provide
260 : /// a context and you are unwilling to change all callers to provide one.
261 : ///
262 : /// Before we add cancellation, we should get rid of this method.
263 : ///
264 : /// [`attached_child`]: Self::attached_child
265 : /// [`detached_child`]: Self::detached_child
266 4425 : pub fn todo_child(task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
267 4425 : Self::new(task_kind, download_behavior)
268 4425 : }
269 :
270 34314 : fn child_impl(&self, task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
271 34314 : Self::new(task_kind, download_behavior)
272 34314 : }
273 :
274 450033113 : pub fn task_kind(&self) -> TaskKind {
275 450033113 : self.task_kind
276 450033113 : }
277 :
278 28643 : pub fn download_behavior(&self) -> DownloadBehavior {
279 28643 : self.download_behavior
280 28643 : }
281 :
282 23938763 : pub(crate) fn access_stats_behavior(&self) -> AccessStatsBehavior {
283 23938763 : self.access_stats_behavior
284 23938763 : }
285 :
286 379007706 : pub(crate) fn page_content_kind(&self) -> PageContentKind {
287 379007706 : self.page_content_kind
288 379007706 : }
289 : }
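
The builder and the "pass it along" convention described in the module comment can be illustrated with a minimal, self-contained sketch. It is not the pageserver code: the `TaskKind` variants are stand-ins for `crate::task_mgr::TaskKind`, `PageContentKind` is omitted for brevity, and `get_page` and `read_layer` are hypothetical functions invented for this example.

```rust
// Stand-in for `crate::task_mgr::TaskKind`; the variants are illustrative only.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TaskKind {
    PageRequestHandler,
    Compaction,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DownloadBehavior {
    Download,
    Warn,
    Error,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AccessStatsBehavior {
    Update,
    Skip,
}

#[derive(Clone, Debug)]
struct RequestContext {
    task_kind: TaskKind,
    download_behavior: DownloadBehavior,
    access_stats_behavior: AccessStatsBehavior,
}

struct RequestContextBuilder {
    inner: RequestContext,
}

impl RequestContextBuilder {
    /// Start from the same defaults as the listing above.
    fn new(task_kind: TaskKind) -> Self {
        Self {
            inner: RequestContext {
                task_kind,
                download_behavior: DownloadBehavior::Download,
                access_stats_behavior: AccessStatsBehavior::Update,
            },
        }
    }

    /// Start from a copy of an existing context; the original is not modified.
    fn extend(original: &RequestContext) -> Self {
        Self { inner: original.clone() }
    }

    fn download_behavior(mut self, b: DownloadBehavior) -> Self {
        self.inner.download_behavior = b;
        self
    }

    fn access_stats_behavior(mut self, b: AccessStatsBehavior) -> Self {
        self.inner.access_stats_behavior = b;
        self
    }

    fn build(self) -> RequestContext {
        self.inner
    }
}

/// Hypothetical low-level function: the only place that inspects the context.
fn read_layer(ctx: &RequestContext) -> Result<&'static str, &'static str> {
    match ctx.download_behavior {
        DownloadBehavior::Download => Ok("downloaded"),
        DownloadBehavior::Warn => Ok("downloaded with warning"),
        DownloadBehavior::Error => Err("needs download"),
    }
}

/// Hypothetical mid-level function: forwards the context without reading it.
fn get_page(ctx: &RequestContext) -> Result<&'static str, &'static str> {
    read_layer(ctx)
}

fn main() {
    let ctx = RequestContextBuilder::new(TaskKind::PageRequestHandler).build();
    assert_eq!(get_page(&ctx), Ok("downloaded"));

    // Derive a stricter context; `ctx` itself stays unchanged.
    let strict = RequestContextBuilder::extend(&ctx)
        .download_behavior(DownloadBehavior::Error)
        .access_stats_behavior(AccessStatsBehavior::Skip)
        .build();
    assert_eq!(get_page(&strict), Err("needs download"));
    assert_eq!(strict.task_kind, TaskKind::PageRequestHandler);
    assert_eq!(strict.access_stats_behavior, AccessStatsBehavior::Skip);
    assert_eq!(ctx.download_behavior, DownloadBehavior::Download);
}
```

In this sketch, adding another field to `RequestContext` would require changes only in the builder and in the low-level function that reads it; `get_page` would compile unchanged, which is the property the module comment argues for.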