Line data Source code
1 : //! Defines [`RequestContext`].
2 : //!
3 : //! It is a structure that we use throughout the pageserver to propagate
4 : //! high-level context from places that _originate_ activity down to the
5 : //! shared code paths at the heart of the pageserver. It's inspired by
6 : //! Golang's `context.Context`.
7 : //!
8 : //! For example, in `Timeline::get(page_nr, lsn)` we need to answer the following questions:
9 : //! 1. What high-level activity ([`TaskKind`]) needs this page?
10 : //! We need that information as a categorical dimension for page access
11 : //! statistics, which we, in turn, need to guide layer eviction policy design.
12 : //! 2. How should we behave if, to produce the page image, we need to
13 : //! download a layer file on demand ([`DownloadBehavior`])?
14 : //!
15 : //! [`RequestContext`] satisfies those needs.
16 : //! The current implementation is a small `struct` that is passed through
17 : //! the call chain by reference.
18 : //!
19 : //! ### Future Work
20 : //!
21 : //! However, we do not intend to stop here, since there are other needs that
22 : //! require carrying information from high to low levels of the app.
23 : //!
24 : //! Most importantly, **cancellation signaling** in response to
25 : //! 1. timeouts (page_service max response time) and
26 : //! 2. lifecycle requests (detach tenant, delete timeline).
27 : //!
28 : //! Related to that, there is sometimes a need to ensure that all tokio tasks spawned
29 : //! by the transitive callees of a request have finished. The keyword here
30 : //! is **Structured Concurrency**, and right now, we use `task_mgr` in most places,
31 : //! `TaskHandle` in some places, and careful code review around `FuturesUnordered`
32 : //! or `JoinSet` in other places.
33 : //!
34 : //! We do not yet have a systematic cancellation story in pageserver, and it is
35 : //! pretty clear that [`RequestContext`] will be responsible for that.
36 : //! So, the API already prepares for this role through the
37 : //! [`RequestContext::detached_child`] and [`RequestContext::attached_child`] methods.
38 : //! See their doc comments for details on how we will use them in the future.
39 : //!
40 : //! It is not clear whether or how we will enforce Structured Concurrency, and
41 : //! what role [`RequestContext`] will play there.
42 : //! So, the API doesn't prepare us for this topic.
43 : //!
44 : //! Other future uses of `RequestContext`:
45 : //! - Communicate compute & IO priorities (user-initiated request vs. background-loop)
46 : //! - Request IDs for distributed tracing
47 : //! - Request/Timeline/Tenant-scoped log levels
48 : //!
49 : //! RequestContext might look quite different once it supports those features.
50 : //! Likely, it will have a shape similar to Golang's `context.Context`.
51 : //!
52 : //! ### Why A Struct Instead Of Method Parameters
53 : //!
54 : //! What's typical about such information is that it needs to be passed down
55 : //! along the call chain from high level to low level, but few of the functions
56 : //! in the middle need to understand it.
57 : //! Further, it is to be expected that we will need to propagate more data
58 : //! in the future (see the earlier section on future work).
59 : //! Hence, for functions in the middle of the call chain, we have the following
60 : //! requirements:
61 : //! 1. It should be easy to forward the context to callees.
62 : //! 2. To propagate more data from high-level to low-level code, the functions in
63 : //! the middle should not need to be modified.
64 : //!
65 : //! The solution is to have a container structure ([`RequestContext`]) that
66 : //! carries the information. Functions that don't care about what's in it
67 : //! pass it along to callees.
68 : //!
69 : //! ### Why Not Task-Local Variables
70 : //!
71 : //! One could use task-local variables (the equivalent of thread-local variables)
72 : //! to address the immediate needs outlined above.
73 : //! However, we reject task-local variables because:
74 : //! 1. they are implicit, thereby making it harder to trace the data flow in code
75 : //! reviews and during debugging,
76 : //! 2. they can be mutable, which enables implicit return data flow,
77 : //! 3. they are restrictive in that code which fans out into multiple tasks,
78 : //! or even threads, needs to carefully propagate the state.
79 : //!
80 : //! In contrast, information flow with [`RequestContext`] is
81 : //! 1. always explicit,
82 : //! 2. strictly uni-directional because RequestContext is immutable,
83 : //! 3. tangible because a [`RequestContext`] is just a value.
84 : //! When creating child activities, regardless of whether it's a task,
85 : //! thread, or even an RPC to another service, the value can
86 : //! be used like any other argument.
87 : //!
88 : //! The solution is that all code paths are infected with precisely one
89 : //! [`RequestContext`] argument. Functions in the middle of the call chain
90 : //! only need to pass it on.
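
The propagation pattern described in the module comment can be sketched as follows. This is a minimal, standalone illustration, not the pageserver code: the `TaskKind` variants, the reduced two-field `RequestContext`, and the functions `handle_get_page`, `reconstruct_page`, and `read_layer` are all hypothetical stand-ins.

```rust
// Hypothetical stand-in; the real `TaskKind` lives in `crate::task_mgr`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TaskKind {
    PageRequestHandler,
    Compaction,
}

// Reduced stand-in for the `DownloadBehavior` enum in the listing.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DownloadBehavior {
    Download,
    Error,
}

// Reduced stand-in for `RequestContext`: an immutable value passed by reference.
#[derive(Debug)]
struct RequestContext {
    task_kind: TaskKind,
    download_behavior: DownloadBehavior,
}

// Top of the call chain: the place that originates the activity creates the context.
fn handle_get_page(page_nr: u32) -> Result<String, String> {
    let ctx = RequestContext {
        task_kind: TaskKind::PageRequestHandler,
        download_behavior: DownloadBehavior::Download,
    };
    reconstruct_page(page_nr, &ctx)
}

// Middle of the call chain: forwards the context without looking inside it.
// Adding a field to `RequestContext` would not require touching this function.
fn reconstruct_page(page_nr: u32, ctx: &RequestContext) -> Result<String, String> {
    read_layer(page_nr, ctx)
}

// Bottom of the call chain: the only function here that inspects the context.
fn read_layer(page_nr: u32, ctx: &RequestContext) -> Result<String, String> {
    match ctx.download_behavior {
        DownloadBehavior::Download => Ok(format!("page {} for {:?}", page_nr, ctx.task_kind)),
        DownloadBehavior::Error => Err(format!("page {} needs download", page_nr)),
    }
}

fn main() {
    println!("{:?}", handle_get_page(7));

    // A different originator picks a different policy; the middle layer is unchanged.
    let ctx = RequestContext {
        task_kind: TaskKind::Compaction,
        download_behavior: DownloadBehavior::Error,
    };
    println!("{:?}", reconstruct_page(7, &ctx));
}
```

Only the originator and the leaf function know what the context contains, which is the property the two requirements for mid-chain functions ask for.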
91 :
92 : use crate::task_mgr::TaskKind;
93 :
94 : /// The main structure of this module; see the module-level comment.
95 : #[derive(Debug)]
96 : pub struct RequestContext {
97 : task_kind: TaskKind,
98 : download_behavior: DownloadBehavior,
99 : access_stats_behavior: AccessStatsBehavior,
100 : page_content_kind: PageContentKind,
101 : read_path_debug: bool,
102 : }
103 :
104 : /// The kind of access to the page cache.
105 : #[derive(Clone, Copy, PartialEq, Eq, Debug, enum_map::Enum, strum_macros::IntoStaticStr)]
106 : pub enum PageContentKind {
107 : Unknown,
108 : DeltaLayerSummary,
109 : DeltaLayerBtreeNode,
110 : DeltaLayerValue,
111 : ImageLayerSummary,
112 : ImageLayerBtreeNode,
113 : ImageLayerValue,
114 : InMemoryLayer,
115 : }
116 :
117 : /// Desired behavior if the operation requires an on-demand download
118 : /// to proceed.
119 : #[derive(Clone, Copy, PartialEq, Eq, Debug)]
120 : pub enum DownloadBehavior {
121 : /// Download the layer file. It can take a while.
122 : Download,
123 :
124 : /// Download the layer file, but print a warning to the log. This should be used
125 : /// in code where the layer file is expected to already exist locally.
126 : Warn,
127 :
128 : /// Return a `PageReconstructError::NeedsDownload` error.
129 : Error,
130 : }
131 :
132 : /// Whether this request should update access times used in LRU eviction.
133 : #[derive(Clone, Copy, PartialEq, Eq, Debug)]
134 : pub(crate) enum AccessStatsBehavior {
135 : /// Update access times: this request's access to data should be taken
136 : /// as a hint that the accessed layer is likely to be accessed again
137 : Update,
138 :
139 : /// Do not update access times: this request is accessing the layer
140 : /// but does not want to indicate that the layer should be retained in cache,
141 : /// perhaps because the requestor is a compaction routine that will soon cover
142 : /// this layer with another.
143 : Skip,
144 : }
145 :
146 : pub struct RequestContextBuilder {
147 : inner: RequestContext,
148 : }
149 :
150 : impl RequestContextBuilder {
151 : /// A new builder with default settings
152 1533106 : pub fn new(task_kind: TaskKind) -> Self {
153 1533106 : Self {
154 1533106 : inner: RequestContext {
155 1533106 : task_kind,
156 1533106 : download_behavior: DownloadBehavior::Download,
157 1533106 : access_stats_behavior: AccessStatsBehavior::Update,
158 1533106 : page_content_kind: PageContentKind::Unknown,
159 1533106 : read_path_debug: false,
160 1533106 : },
161 1533106 : }
162 1533106 : }
163 :
164 1697203 : pub fn extend(original: &RequestContext) -> Self {
165 1697203 : Self {
166 1697203 : // This is like a `Copy`, but we avoid implementing `Copy` because ordinary users of
167 1697203 : // `RequestContext` should always move it or pass it by reference.
168 1697203 : inner: RequestContext {
169 1697203 : task_kind: original.task_kind,
170 1697203 : download_behavior: original.download_behavior,
171 1697203 : access_stats_behavior: original.access_stats_behavior,
172 1697203 : page_content_kind: original.page_content_kind,
173 1697203 : read_path_debug: original.read_path_debug,
174 1697203 : },
175 1697203 : }
176 1697203 : }
177 :
178 : /// Configure the DownloadBehavior of the context: whether to
179 : /// download missing layers, and/or warn on the download.
180 1533106 : pub fn download_behavior(mut self, b: DownloadBehavior) -> Self {
181 1533106 : self.inner.download_behavior = b;
182 1533106 : self
183 1533106 : }
184 :
185 : /// Configure the AccessStatsBehavior of the context: whether layer
186 : /// accesses should update the access time of the layer.
187 728 : pub(crate) fn access_stats_behavior(mut self, b: AccessStatsBehavior) -> Self {
188 728 : self.inner.access_stats_behavior = b;
189 728 : self
190 728 : }
191 :
192 1696475 : pub(crate) fn page_content_kind(mut self, k: PageContentKind) -> Self {
193 1696475 : self.inner.page_content_kind = k;
194 1696475 : self
195 1696475 : }
196 :
197 0 : pub(crate) fn read_path_debug(mut self, b: bool) -> Self {
198 0 : self.inner.read_path_debug = b;
199 0 : self
200 0 : }
201 :
202 3230309 : pub fn build(self) -> RequestContext {
203 3230309 : self.inner
204 3230309 : }
205 : }
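
A usage sketch for the builder follows. It is a reduced, standalone replica under stated assumptions: the single `TaskKind::Compaction` variant and the helper `compaction_read_ctx` are hypothetical, and the `page_content_kind` and `read_path_debug` fields are omitted for brevity.

```rust
// Hypothetical stand-in; the real `TaskKind` lives in `crate::task_mgr`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TaskKind {
    Compaction,
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DownloadBehavior {
    Download,
    Warn,
    Error,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AccessStatsBehavior {
    Update,
    Skip,
}

// Reduced replica of `RequestContext` with three of its five fields.
#[derive(Debug)]
struct RequestContext {
    task_kind: TaskKind,
    download_behavior: DownloadBehavior,
    access_stats_behavior: AccessStatsBehavior,
}

struct RequestContextBuilder {
    inner: RequestContext,
}

impl RequestContextBuilder {
    // Start from the defaults used in the listing: `Download` and `Update`.
    fn new(task_kind: TaskKind) -> Self {
        Self {
            inner: RequestContext {
                task_kind,
                download_behavior: DownloadBehavior::Download,
                access_stats_behavior: AccessStatsBehavior::Update,
            },
        }
    }

    // Start from a field-by-field copy of an existing context.
    fn extend(original: &RequestContext) -> Self {
        Self {
            inner: RequestContext {
                task_kind: original.task_kind,
                download_behavior: original.download_behavior,
                access_stats_behavior: original.access_stats_behavior,
            },
        }
    }

    fn download_behavior(mut self, b: DownloadBehavior) -> Self {
        self.inner.download_behavior = b;
        self
    }

    fn access_stats_behavior(mut self, b: AccessStatsBehavior) -> Self {
        self.inner.access_stats_behavior = b;
        self
    }

    fn build(self) -> RequestContext {
        self.inner
    }
}

// Hypothetical helper: derive a context that reads layers without
// marking them as recently used.
fn compaction_read_ctx(parent: &RequestContext) -> RequestContext {
    RequestContextBuilder::extend(parent)
        .access_stats_behavior(AccessStatsBehavior::Skip)
        .build()
}

fn main() {
    let parent = RequestContextBuilder::new(TaskKind::Compaction)
        .download_behavior(DownloadBehavior::Warn)
        .build();
    let child = compaction_read_ctx(&parent);
    println!("parent = {:?}", parent);
    println!("child  = {:?}", child);
}
```

Because `extend` copies every field, the derived context keeps the parent's `task_kind` and `download_behavior` and changes only the field that was overridden; the parent itself is left untouched.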
206 :
207 : impl RequestContext {
208 : /// Create a new RequestContext that has no parent.
209 : ///
210 : /// The function is called `new` because, once we add children
211 : /// to it using `detached_child` or `attached_child`, the contexts
212 : /// form a tree (not implemented yet, since cancellation will be
213 : /// the first feature that requires a tree).
214 : ///
215 : /// # Future: Cancellation
216 : ///
217 : /// A context like this one can only be canceled if someone
218 : /// cancels it explicitly.
219 : /// It has no parent, so it cannot inherit cancellation from there.
220 1533106 : pub fn new(task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
221 1533106 : RequestContextBuilder::new(task_kind)
222 1533106 : .download_behavior(download_behavior)
223 1533106 : .build()
224 1533106 : }
225 :
226 : /// Create a detached child context for a task that may outlive `self`.
227 : ///
228 : /// Use this when spawning new background activity that should complete
229 : /// even if the current request is canceled.
230 : ///
231 : /// # Future: Cancellation
232 : ///
233 : /// Cancellation of `self` will not propagate to the child context returned
234 : /// by this method.
235 : ///
236 : /// # Future: Structured Concurrency
237 : ///
238 : /// We could add the Future as a parameter to this function, spawn it as a task,
239 : /// and pass to the new task the child context as an argument.
240 : /// That would be an ergonomic improvement.
241 : ///
242 : /// We could make new calls to this function fail if `self` is already canceled.
243 432 : pub fn detached_child(&self, task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
244 432 : self.child_impl(task_kind, download_behavior)
245 432 : }
246 :
247 : /// Create a child of context `self` for a task that shall not outlive `self`.
248 : ///
249 : /// Use this when fanning out work to other async tasks.
250 : ///
251 : /// # Future: Cancellation
252 : ///
253 : /// Canceling a context will propagate to its attached children.
254 : ///
255 : /// # Future: Structured Concurrency
256 : ///
257 : /// We could add the Future as a parameter to this function, spawn it as a task,
258 : /// and track its `JoinHandle` inside the `RequestContext`.
259 : ///
260 : /// We could then provide another method to allow waiting for all child tasks
261 : /// to finish.
262 : ///
263 : /// We could make new calls to this function fail if `self` is already canceled.
264 : /// Alternatively, we could allow the creation but not spawn the task.
265 : /// The method to wait for child tasks would return an error, indicating
266 : /// that the child task was not started because the context was canceled.
267 1531106 : pub fn attached_child(&self) -> Self {
268 1531106 : self.child_impl(self.task_kind(), self.download_behavior())
269 1531106 : }
270 :
271 : /// Use this function when you should be creating a child context using
272 : /// [`attached_child`] or [`detached_child`], but your caller doesn't provide
273 : /// a context and you are unwilling to change all callers to provide one.
274 : ///
275 : /// Before we add cancellation, we should get rid of this method.
276 : ///
277 : /// [`attached_child`]: Self::attached_child
278 : /// [`detached_child`]: Self::detached_child
279 884 : pub fn todo_child(task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
280 884 : Self::new(task_kind, download_behavior)
281 884 : }
282 :
283 1531538 : fn child_impl(&self, task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
284 1531538 : Self::new(task_kind, download_behavior)
285 1531538 : }
286 :
287 3967630 : pub fn task_kind(&self) -> TaskKind {
288 3967630 : self.task_kind
289 3967630 : }
290 :
291 1531138 : pub fn download_behavior(&self) -> DownloadBehavior {
292 1531138 : self.download_behavior
293 1531138 : }
294 :
295 479440 : pub(crate) fn access_stats_behavior(&self) -> AccessStatsBehavior {
296 479440 : self.access_stats_behavior
297 479440 : }
298 :
299 1944592 : pub(crate) fn page_content_kind(&self) -> PageContentKind {
300 1944592 : self.page_content_kind
301 1944592 : }
302 :
303 0 : pub(crate) fn read_path_debug(&self) -> bool {
304 0 : self.read_path_debug
305 0 : }
306 : }
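
As written in the listing, `child_impl` delegates to `RequestContext::new`, so a child context carries over only the task kind and download behavior, while the remaining fields fall back to the builder defaults. The sketch below reproduces that behavior in a reduced, standalone form; the `TaskKind` variants are hypothetical and the struct has three fields instead of five.

```rust
// Hypothetical stand-in; the real `TaskKind` lives in `crate::task_mgr`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TaskKind {
    PageRequestHandler,
    BackgroundUpload,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DownloadBehavior {
    Download,
    Error,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AccessStatsBehavior {
    Update,
    Skip,
}

// Reduced replica of `RequestContext`.
#[derive(Debug)]
struct RequestContext {
    task_kind: TaskKind,
    download_behavior: DownloadBehavior,
    access_stats_behavior: AccessStatsBehavior,
}

impl RequestContext {
    // Mirrors `new` in the listing: everything not passed in gets its default.
    fn new(task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
        Self {
            task_kind,
            download_behavior,
            access_stats_behavior: AccessStatsBehavior::Update,
        }
    }

    // Mirrors `detached_child`: the caller chooses the new kind and behavior.
    fn detached_child(&self, task_kind: TaskKind, download_behavior: DownloadBehavior) -> Self {
        Self::new(task_kind, download_behavior)
    }

    // Mirrors `attached_child`: kind and behavior are taken from `self`.
    fn attached_child(&self) -> Self {
        Self::new(self.task_kind, self.download_behavior)
    }
}

fn main() {
    // A parent that opted out of access-time updates.
    let parent = RequestContext {
        task_kind: TaskKind::PageRequestHandler,
        download_behavior: DownloadBehavior::Error,
        access_stats_behavior: AccessStatsBehavior::Skip,
    };

    let attached = parent.attached_child();
    let detached = parent.detached_child(TaskKind::BackgroundUpload, DownloadBehavior::Download);

    println!("parent   = {:?}", parent);
    println!("attached = {:?}", attached);
    println!("detached = {:?}", detached);
}
```

In this sketch the attached child ends up with `AccessStatsBehavior::Update` even though the parent used `Skip`. Code that needs to preserve every field would go through `RequestContextBuilder::extend` instead.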