Commit ded471b9e785f697c3a6a49c606ce9dd88987405
Fixes typos/grammar making examples easier to read
Harry Sarson authored on 9/6/2017, 7:28:07 PM
GitHub committed on 9/6/2017, 7:28:07 PM
Parent: 1d1966f84c7dd4bf7fb3b3897d376fc2603f309d
Files changed
docs/examples.md | changed |
docs/examples.md | ||
---|---|---|
@@ -4,88 +4,89 @@ | ||
4 | 4 … | |
5 | 5 … | Much of the focus here is handling the error cases. Indeed, |
6 | 6 … | distributed systems are _all about_ handling the error cases. |
7 | 7 … | |
8 | -# simple source that ends correctly. (read, end) | |
8 … | +# A simple source that ends correctly. (read, end) | |
9 | 9 … | |
10 | 10 … | A normal file (source) is read, and sent to a sink stream |
11 | -that computes some aggregation upon that input. | |
12 | -such as the number of bytes, or number of occurances of the `\n` | |
11 … | +that computes some aggregation upon that input such as | |
12 … | +the number of bytes, or number of occurrences of the `\n` | |
13 | 13 … | character (i.e. the number of lines). |
14 | 14 … | |
15 | 15 … | The source reads a chunk of the file each time it's called; |
16 | 16 … | there is some optimum size depending on your operating system, |
17 | 17 … | file system, physical hardware, |
18 | 18 … | and how many other files are being read concurrently. |
19 | 19 … | |
20 | -when the sink gets a chunk, it iterates over the characters in it | |
21 | -counting the `\n` characters. when the source returns `end` to the | |
20 … | +When the sink gets a chunk, it iterates over the characters in it | |
21 … | +counting the `\n` characters. When the source returns `end` to the | |
22 | 22 … | sink, the sink calls a user provided callback. |
23 | 23 … | |
24 | -# source that may fail. (read, err, end) | |
24 … | +# A source that may fail. (read, err, end) | |
25 | 25 … | |
26 | -download a file over http and write it to fail. | |
26 … | +A file is downloaded over HTTP and written to a file. | |
27 | 27 … | The network should always be considered to be unreliable, |
28 | -and you must design your system to recover from failures. | |
29 | -So there for the download may fail (wifi cuts out or something) | |
28 … | +and you must design your system to recover if the download | |
29 … | +fails (for example, if the wifi were to cut out). | |
30 | 30 … | |
31 | 31 … | The read stream is just the http download, and the sink |
32 | -writes it to a tempfile. If the source ends normally, | |
33 | -the tempfile is moved to the correct location. | |
34 | -If the source errors, the tempfile is deleted. | |
32 … | +writes it to a temporary file. If the source ends normally, | |
33 … | +the temporary file is moved to the correct location. | |
34 … | +If the source errors, the temporary file is deleted. | |
35 | 35 … | |
36 | -(you could also write the file to the correct location, | |
37 | -and delete it if it errors, but the tempfile method has the advantage | |
38 | -that if the computer or process crashes it leaves only a tempfile | |
39 | -and not a file that appears valid. stray tempfiles can be cleaned up | |
40 | -or resumed when the process restarts) | |
36 … | +(You could also write the file to the correct location, | |
37 … | +and delete it if it errors, but the temporary file method has the advantage | |
38 … | +that if the computer or process crashes it leaves only a temporary file | |
39 … | +and not a file that appears valid. Stray temporary files can be cleaned up | |
40 … | +or resumed when the process restarts.) | |
41 | 41 … | |
42 | -# sink that may fail | |
42 … | +# A sink that may fail | |
43 | 43 … | |
44 | -If we read a file from disk, and upload it, | |
45 | -then it is the sink that may error. | |
46 | -The file system is probably faster than the upload, | |
47 | -so it will mostly be waiting for the sink to ask for more. | |
48 | -usually, the sink calls read, and the source gets more from the file | |
49 | -until the file ends. If the sink errors, it calls `read(true, cb)` | |
44 … | +If we read a file from disk, and upload it, then the upload is the sink that may error. | |
45 … | +The file system is probably faster than the upload and | |
46 … | +so it will mostly be waiting for the sink to ask for more data. | |
47 … | +Usually the sink calls `read(null, cb)` and the source retrieves chunks of the file | |
48 … | +until the file ends. If the sink errors, it then calls `read(true, cb)` | |
50 | 49 … | and the source closes the file descriptor and stops reading. |
51 | 50 … | In this case the whole file is never loaded into memory. |
52 | 51 … | |
53 | -# sink that may fail out of turn. | |
52 … | +# A sink that may fail out of turn. | |
54 | 53 … | |
55 | 54 … | An HTTP client connects to a log server and tails a log in realtime. |
56 | -(another process writes to the log file, | |
57 | -but we don't need to think about that) | |
55 … | +(Another process will write to the log file, | |
56 … | +but we don't need to worry about that.) | |
58 | 57 … | |
59 | -The source is the server log stream, and the sink is the client. | |
58 … | +The source is the server's log stream, and the sink is the client. | |
60 | 59 … | First the source outputs the old data; this will always be a fast |
61 | -response, because that data is already at hand. When that is all | |
62 | -written then the output rate may drop significantly because it will | |
63 | -wait for new data to be added to the file. Because of this, | |
64 | -it becomes much more likely that the sink errors (the network connection | |
60 … | +response, because that data is already at hand. When the old data is all | |
61 … | +written then the output rate may drop significantly because the server (the source) will | |
62 … | +wait for new data to be added to the file. Therefore, | |
63 … | +it becomes much more likely that the sink will error (for example if the network connection | |
65 | 64 … | drops) while the source is waiting for new data. Because of this, |
66 | 65 … | it's necessary to be able to abort the stream reading (after you called |
67 | 66 … | read, but before it called back). If it was not possible to abort |
68 | 67 … | out of turn, you'd have to wait for the next read before you can abort |
69 | -but, depending on the source of the stream, that may never come. | |
68 … | +but, depending on the source of the stream, the next read may never come. | |
70 | 69 … | |
71 | -# a through stream that needs to abort. | |
70 … | +# A through stream that needs to abort. | |
72 | 71 … | |
73 | -Say we read from a file (source), JSON parse each line (through), | |
72 … | +Say we wish to read from a file (source), parse each line as JSON (through), | |
74 | 73 … | and then output to another file (sink). |
75 | -because there is valid and invalid JSON, the parse could error, | |
76 | -if this parsing is a fatal error, then we are aborting the pipeline | |
77 | -from the middle. Here the source is normal, but then the through fails. | |
78 | -When the through finds an invalid line, it should abort the source, | |
74 … | +If the parser encounters illegal JSON then it will error and, | |
75 … | +if this parsing is a fatal error, then the parser needs to abort the pipeline | |
76 … | +from the middle. Here the source reads normally, but then the through fails. | |
77 … | +When the through finds an invalid line, it should first abort the source, | |
79 | 78 … | and then callback to the sink with an error. This way, |
80 | 79 … | by the time the sink receives the error, the entire stream has been cleaned up. |
81 | 80 … | |
82 | -(you could abort the source, and error back to the sink in parallel, | |
83 | -but if something happened to the source while aborting, for the user | |
84 | -to know they'd have to give another callback to the source, this would | |
85 | -get called very rarely so users would be inclined to not handle that. | |
86 | -better to have one callback at the sink.) | |
81 … | +(You could abort the source and error back to the sink in parallel. | |
82 … | +However, if something happened to the source while aborting, for the user to | |
83 … | +discover this error they would have to call the source again with another callback. As this | |
84 … | +situation would occur only rarely, users would be inclined not to handle it, leading to | |
85 … | +the possibility of undetected errors. | |
86 … | +Therefore, as it is better to have one callback at the sink, wait until the source | |
87 … | +has finished cleaning up before calling back to the sink with an error.) | |
87 | 88 … | |
88 | -In some cases you may want the stream to continue, and just ignore | |
89 | -an invalid line if it does not parse. An example where you definately | |
90 | -want to abort if it's invalid would be an encrypted stream, which | |
89 … | +In some cases you may want the stream to continue, and the through stream can just ignore | |
90 … | +any lines that do not parse. An example where you definitely | |
91 … | +want a through stream to abort on invalid input would be an encrypted stream, which | |
91 | 92 … | should be broken into chunks that are encrypted separately. |
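
The last example in the changed file (a through stream that aborts on invalid JSON) can be sketched in a few lines. This is only an illustration, not code from the repository: it assumes the `read(abort, cb)` / `cb(end, data)` convention the text refers to, and the names `linesSource`, `parseJsonThrough`, `collectSink`, `run`, and the `state.aborted` flag are all hypothetical. An in-memory array stands in for the file so that everything runs synchronously.

```javascript
// Source: emits each line of an in-memory array, then ends.
// `state.aborted` is recorded only so the example can show that cleanup ran.
function linesSource(lines, state) {
  let i = 0;
  return function read(abort, cb) {
    if (abort) {
      state.aborted = true; // a real file source would close its descriptor here
      return cb(abort);
    }
    if (i >= lines.length) return cb(true); // normal end
    cb(null, lines[i++]);
  };
}

// Through: parses each line as JSON. On invalid input it aborts the source
// first, and only passes the error to the sink once the source has called back.
function parseJsonThrough(read) {
  return function (abort, cb) {
    read(abort, function (end, line) {
      if (end) return cb(end);
      let value;
      try {
        value = JSON.parse(line);
      } catch (err) {
        return read(err, function () { cb(err); });
      }
      cb(null, value);
    });
  };
}

// Sink: collects parsed values and reports through a single callback.
// The recursion is fine for a short synchronous example like this one.
function collectSink(done) {
  return function (read) {
    const values = [];
    (function next() {
      read(null, function (end, value) {
        if (end) return done(end === true ? null : end, values);
        values.push(value);
        next();
      });
    })();
  };
}

// Wires source -> through -> sink and returns what the sink observed.
function run(lines) {
  const state = { aborted: false };
  let result;
  collectSink(function (err, values) {
    result = { err: err, values: values, aborted: state.aborted };
  })(parseJsonThrough(linesSource(lines, state)));
  return result; // already set, because this toy source calls back synchronously
}
```

Calling `run(['{"a":1}', 'not json'])` reaches the sink callback with a `SyntaxError`, and by that point the source has already been aborted, so the sink is the only place the user needs to handle errors.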