Commit ded471b9e785f697c3a6a49c606ce9dd88987405
Fixes typos/grammar making examples easier to read
Harry Sarson authored on 9/6/2017, 7:28:07 PM; GitHub committed on 9/6/2017, 7:28:07 PM
Parent: 1d1966f84c7dd4bf7fb3b3897d376fc2603f309d
Files changed
docs/examples.md | changed |
docs/examples.md | |||
---|---|---|---|
@@ -4,88 +4,89 @@ | |||
4 | 4 … | ||
5 | 5 … | Much of the focus here is handling the error cases. Indeed, | |
6 | 6 … | distributed systems are _all about_ handling the error cases. | |
7 | 7 … | ||
8 | -# simple source that ends correctly. (read, end) | ||
8 … | +# A simple source that ends correctly. (read, end) | ||
9 | 9 … | ||
10 | 10 … | A normal file (source) is read, and sent to a sink stream | |
11 | -that computes some aggregation upon that input. | ||
12 | -such as the number of bytes, or number of occurances of the `\n` | ||
11 … | +that computes some aggregation upon that input such as | ||
12 … | +the number of bytes, or number of occurances of the `\n` | ||
13 | 13 … | character (i.e. the number of lines). | |
14 | 14 … | ||
15 | 15 … | The source reads a chunk of the file at each time it's called, | |
16 | 16 … | there is some optimium size depending on your operating system, | |
17 | 17 … | file system, physical hardware, | |
18 | 18 … | and how many other files are being read concurrently. | |
19 | 19 … | ||
20 | -when the sink gets a chunk, it iterates over the characters in it | ||
21 | -counting the `\n` characters. when the source returns `end` to the | ||
20 … | +When the sink gets a chunk, it iterates over the characters in it | ||
21 … | +counting the `\n` characters. When the source returns `end` to the | ||
22 | 22 … | sink, the sink calls a user provided callback. | |
23 | 23 … | ||
24 | -# source that may fail. (read, err, end) | ||
24 … | +# A source that may fail. (read, err, end) | ||
25 | 25 … | ||
26 | -download a file over http and write it to fail. | ||
26 … | +A file is downloaded over http and written to a file. | ||
27 | 27 … | The network should always be considered to be unreliable, | |
28 | -and you must design your system to recover from failures. | ||
29 | -So there for the download may fail (wifi cuts out or something) | ||
28 … | +and you must design your system to recover if the download | ||
29 … | +fails. (For example if the wifi were to cut out). | ||
30 | 30 … | ||
31 | 31 … | The read stream is just the http download, and the sink | |
32 | -writes it to a tempfile. If the source ends normally, | ||
33 | -the tempfile is moved to the correct location. | ||
34 | -If the source errors, the tempfile is deleted. | ||
32 … | +writes it to a tempory file. If the source ends normally, | ||
33 … | +the tempory file is moved to the correct location. | ||
34 … | +If the source errors, the tempory file is deleted. | ||
35 | 35 … | ||
36 | -(you could also write the file to the correct location, | ||
37 | -and delete it if it errors, but the tempfile method has the advantage | ||
38 | -that if the computer or process crashes it leaves only a tempfile | ||
39 | -and not a file that appears valid. stray tempfiles can be cleaned up | ||
40 | -or resumed when the process restarts) | ||
36 … | +(You could also write the file to the correct location, | ||
37 … | +and delete it if it errors, but the tempory file method has the advantage | ||
38 … | +that if the computer or process crashes it leaves only a tempory file | ||
39 … | +and not a file that appears valid. Stray tempory files can be cleaned up | ||
40 … | +or resumed when the process restarts.) | ||
41 | 41 … | ||
42 | -# sink that may fail | ||
42 … | +# A sink that may fail | ||
43 | 43 … | ||
44 | -If we read a file from disk, and upload it, | ||
45 | -then it is the sink that may error. | ||
46 | -The file system is probably faster than the upload, | ||
47 | -so it will mostly be waiting for the sink to ask for more. | ||
48 | -usually, the sink calls read, and the source gets more from the file | ||
49 | -until the file ends. If the sink errors, it calls `read(true, cb)` | ||
44 … | +If we read a file from disk, and upload it, then the upload is the sink that may error. | ||
45 … | +The file system is probably faster than the upload and | ||
46 … | +so it will mostly be waiting for the sink to ask for more data. | ||
47 … | +Usually the sink calls `read(null, cb)` and the source retrives chunks of the file | ||
48 … | +until the file ends. If the sink errors, it then calls `read(true, cb)` | ||
50 | 49 … | and the source closes the file descriptor and stops reading. | |
51 | 50 … | In this case the whole file is never loaded into memory. | |
52 | 51 … | ||
53 | -# sink that may fail out of turn. | ||
52 … | +# A sink that may fail out of turn. | ||
54 | 53 … | ||
55 | 54 … | A http client connects to a log server and tails a log in realtime. | |
56 | -(another process writes to the log file, | ||
57 | -but we don't need to think about that) | ||
55 … | +(Another process will write to the log file, | ||
56 … | +but we don't need to worry about that.) | ||
58 | 57 … | ||
59 | -The source is the server log stream, and the sink is the client. | ||
58 … | +The source is the server's log stream, and the sink is the client. | ||
60 | 59 … | First the source outputs the old data, this will always be a fast | |
61 | -response, because that data is already at hand. When that is all | ||
62 | -written then the output rate may drop significantly because it will | ||
63 | -wait for new data to be added to the file. Because of this, | ||
64 | -it becomes much more likely that the sink errors (the network connection | ||
60 … | +response, because that data is already at hand. When the old data is all | ||
61 … | +written then the output rate may drop significantly because the server (the source) will | ||
62 … | +wait for new data to be added to the file. Therefore, | ||
63 … | +it becomes much more likely that the sink will error (for example if the network connection | ||
65 | 64 … | drops) while the source is waiting for new data. Because of this, | |
66 | 65 … | it's necessary to be able to abort the stream reading (after you called | |
67 | 66 … | read, but before it called back). If it was not possible to abort | |
68 | 67 … | out of turn, you'd have to wait for the next read before you can abort | |
69 | -but, depending on the source of the stream, that may never come. | ||
68 … | +but, depending on the source of the stream, the next read may never come. | ||
70 | 69 … | ||
71 | -# a through stream that needs to abort. | ||
70 … | +# A through stream that needs to abort. | ||
72 | 71 … | ||
73 | -Say we read from a file (source), JSON parse each line (through), | ||
72 … | +Say we wish to read from a file (source), parse each line as JSON (through), | ||
74 | 73 … | and then output to another file (sink). | |
75 | -because there is valid and invalid JSON, the parse could error, | ||
76 | -if this parsing is a fatal error, then we are aborting the pipeline | ||
77 | -from the middle. Here the source is normal, but then the through fails. | ||
78 | -When the through finds an invalid line, it should abort the source, | ||
74 … | +If the parser encounters illegal JSON then it will error and, | ||
75 … | +if this parsing is a fatal error, then the parser needs to abort the pipeline | ||
76 … | +from the middle. Here the source reads normaly, but then the through fails. | ||
77 … | +When the through finds an invalid line, it should first abort the source, | ||
79 | 78 … | and then callback to the sink with an error. This way, | |
80 | 79 … | by the time the sink receives the error, the entire stream has been cleaned up. | |
81 | 80 … | ||
82 | -(you could abort the source, and error back to the sink in parallel, | ||
83 | -but if something happened to the source while aborting, for the user | ||
84 | -to know they'd have to give another callback to the source, this would | ||
85 | -get called very rarely so users would be inclined to not handle that. | ||
86 | -better to have one callback at the sink.) | ||
81 … | +(You could abort the source and error back to the sink in parallel. | ||
82 … | +However, if something happened to the source while aborting, for the user | ||
83 … | +discover this error they would have to call the source again with another callback, as | ||
84 … | +situation would occur only rarely users would be inclined to not handle it leading to | ||
85 … | +the possiblity of undetected errors. | ||
86 … | +Therefore, as it is better to have one callback at the sink, wait until the source | ||
87 … | +has finished cleaning up before callingback to the pink with an error.) | ||
87 | 88 … | ||
88 | -In some cases you may want the stream to continue, and just ignore | ||
89 | -an invalid line if it does not parse. An example where you definately | ||
90 | -want to abort if it's invalid would be an encrypted stream, which | ||
89 … | +In some cases you may want the stream to continue, and the the through stream can just ignore | ||
90 … | +an any linesthat do not parse. An example where you definately | ||
91 … | +want a through stream to abort on invalid input would be an encrypted stream, which | ||
91 | 92 … | should be broken into chunks that are encrypted separately. |
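
The scenarios described in the changed text can be sketched in plain JavaScript. This is a minimal illustration, not code from the commit. It assumes the `read(abort, cb)` / `cb(end, data)` callback convention implied by the `read(null, cb)` and `read(true, cb)` calls in the text. The helper names `values`, `countLines`, `parseJson` and `collect`, and the `onAbort` hook, are hypothetical and exist only for this example.

```javascript
// Source: hands out one chunk per read, then signals a normal end.
// `onAbort` is a hypothetical hook standing in for cleanup such as
// closing a file descriptor.
function values(chunks, onAbort) {
  let i = 0;
  return function read(abort, cb) {
    if (abort) {
      // The reader asked us to stop: clean up, then confirm.
      if (onAbort) onAbort(abort);
      return cb(abort);
    }
    if (i >= chunks.length) return cb(true); // normal end
    cb(null, chunks[i++]);
  };
}

// Sink: counts `\n` characters, then calls the user-provided callback.
function countLines(done) {
  return function sink(read) {
    let count = 0;
    read(null, function next(end, chunk) {
      if (end === true) return done(null, count); // source ended normally
      if (end) return done(end);                  // source errored
      for (const ch of chunk) if (ch === '\n') count++;
      read(null, next);
    });
  };
}

// Through: parses each line as JSON. On an invalid line it first aborts
// the source and only then calls back to the sink with the error.
function parseJson() {
  return function through(read) {
    return function (abort, cb) {
      if (abort) return read(abort, cb); // pass aborts from the sink upstream
      read(null, function (end, line) {
        if (end) return cb(end);
        let value;
        try {
          value = JSON.parse(line);
        } catch (err) {
          // Wait for the source to finish cleaning up before erroring.
          return read(err, function () { cb(err); });
        }
        cb(null, value);
      });
    };
  };
}

// Sink: gathers every item and reports any error through one callback.
function collect(done) {
  return function sink(read) {
    const items = [];
    read(null, function next(end, item) {
      if (end === true) return done(null, items);
      if (end) return done(end, items);
      items.push(item);
      read(null, next);
    });
  };
}
```

For example, `countLines(cb)(values(['a\nb', '\nc\n']))` calls `cb` with a count of 3, and `collect(cb)(parseJson()(values(lines, onAbort)))` runs `onAbort` before `cb` receives the parse error. The sketch is fully synchronous and recursive, so a real source with many chunks would need to guard against deep call stacks.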