
Dominic / pull-stream



Commit ded471b9e785f697c3a6a49c606ce9dd88987405

Fixes typos/grammar making examples easier to read

Harry Sarson authored on 9/6/2017, 7:28:07 PM
GitHub committed on 9/6/2017, 7:28:07 PM
Parent: 1d1966f84c7dd4bf7fb3b3897d376fc2603f309d

Files changed

docs/examples.md
@@ -4,88 +4,89 @@
44
55 Much of the focus here is handling the error cases. Indeed,
66 distributed systems are _all about_ handling the error cases.
77
8-# simple source that ends correctly. (read, end)
8 +# A simple source that ends correctly. (read, end)
99
1010 A normal file (source) is read, and sent to a sink stream
11-that computes some aggregation upon that input.
12-such as the number of bytes, or number of occurances of the `\n`
11 +that computes some aggregation upon that input such as
12 +the number of bytes, or number of occurrences of the `\n`
1313 character (i.e. the number of lines).
1414
1515 The source reads a chunk of the file each time it's called;
1616 there is some optimum size depending on your operating system,
1717 file system, physical hardware,
1818 and how many other files are being read concurrently.
1919
20-when the sink gets a chunk, it iterates over the characters in it
21-counting the `\n` characters. when the source returns `end` to the
20 +When the sink gets a chunk, it iterates over the characters in it,
21 +counting the `\n` characters. When the source returns `end` to the
2222 sink, the sink calls a user provided callback.
2323
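A minimal sketch of this pattern, assuming nothing beyond the `read(abort, cb)` / `cb(end, data)`
protocol described above; the `chunkSource` and `lineCounter` functions below are illustrative
stand-ins for the file source and counting sink, not part of pull-stream:

```js
// Source: hands out one chunk per call to read, then signals end with cb(true).
function chunkSource (chunks) {
  var i = 0
  return function read (abort, cb) {
    if (abort) return cb(abort)
    if (i >= chunks.length) return cb(true) // end
    cb(null, chunks[i++])
  }
}

// Sink: counts `\n` characters and calls the user-provided callback on end.
function lineCounter (done) {
  return function sink (read) {
    var count = 0
    read(null, function next (end, chunk) {
      if (end === true) return done(null, count) // normal end
      if (end) return done(end)                  // error
      for (var j = 0; j < chunk.length; j++)
        if (chunk[j] === '\n') count++
      read(null, next)
    })
  }
}

lineCounter(function (err, lines) {
  console.log(lines) // => 2
})(chunkSource(['hello\nwor', 'ld\n']))
```
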
24-# source that may fail. (read, err, end)
24 +# A source that may fail. (read, err, end)
2525
26-download a file over http and write it to fail.
26 +A file is downloaded over http and written to disk.
2727 The network should always be considered to be unreliable,
28-and you must design your system to recover from failures.
29-So there for the download may fail (wifi cuts out or something)
28 +and you must design your system to recover if the download
29 +fails (for example, if the wifi cuts out).
3030
3131 The read stream is just the http download, and the sink
32-writes it to a tempfile. If the source ends normally,
33-the tempfile is moved to the correct location.
34-If the source errors, the tempfile is deleted.
32 +writes it to a temporary file. If the source ends normally,
33 +the temporary file is moved to the correct location.
34 +If the source errors, the temporary file is deleted.
3535
36-(you could also write the file to the correct location,
37-and delete it if it errors, but the tempfile method has the advantage
38-that if the computer or process crashes it leaves only a tempfile
39-and not a file that appears valid. stray tempfiles can be cleaned up
40-or resumed when the process restarts)
36 +(You could also write the file to the correct location,
37 +and delete it if it errors, but the temporary file method has the advantage
38 +that if the computer or process crashes it leaves only a temporary file
39 +and not a file that appears valid. Stray temporary files can be cleaned up
40 +or resumed when the process restarts.)
4141
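As a rough sketch of the temporary-file idea, assuming hypothetical `appendToTempFile`,
`moveTempFile` and `deleteTempFile` helpers in place of the real fs calls, so only the shape
of the sink is shown:

```js
// Sink sketch: write chunks to a temporary file, then move or delete it
// depending on whether the source ended normally or errored.
// appendToTempFile / moveTempFile / deleteTempFile are hypothetical helpers.
function tempFileSink (tmpPath, finalPath, done) {
  return function sink (read) {
    read(null, function next (end, chunk) {
      if (end === true) {                // download ended normally
        moveTempFile(tmpPath, finalPath)
        return done(null)
      }
      if (end) {                         // download errored (e.g. wifi cut out)
        deleteTempFile(tmpPath)
        return done(end)
      }
      appendToTempFile(tmpPath, chunk)
      read(null, next)
    })
  }
}
```
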
42-# sink that may fail
42 +# A sink that may fail
4343
44-If we read a file from disk, and upload it,
45-then it is the sink that may error.
46-The file system is probably faster than the upload,
47-so it will mostly be waiting for the sink to ask for more.
48-usually, the sink calls read, and the source gets more from the file
49-until the file ends. If the sink errors, it calls `read(true, cb)`
44 +If we read a file from disk and upload it, then the upload is the sink that may error.
45 +The file system is probably faster than the upload and
46 +so it will mostly be waiting for the sink to ask for more data.
47 +Usually the sink calls `read(null, cb)` and the source retrieves chunks of the file
48 +until the file ends. If the sink errors, it then calls `read(true, cb)`
5049 and the source closes the file descriptor and stops reading.
5150 In this case the whole file is never loaded into memory.
5251
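A sketch of the file-source side of this, using Node's `fs` module; the chunk size is
arbitrary and most error handling is trimmed:

```js
var fs = require('fs')

// Source sketch: reads the file one chunk at a time, and closes the file
// descriptor when the sink aborts with read(true, cb) (e.g. the upload failed).
function fileSource (path) {
  var fd = null
  var buf = Buffer.alloc(65536) // arbitrary chunk size
  return function read (abort, cb) {
    if (abort) {
      if (fd == null) return cb(abort)
      return fs.close(fd, function () { cb(abort) })
    }
    if (fd == null) {
      fs.open(path, 'r', function (err, _fd) {
        if (err) return cb(err)
        fd = _fd
        readChunk()
      })
    } else readChunk()

    function readChunk () {
      fs.read(fd, buf, 0, buf.length, null, function (err, bytes) {
        if (err) return cb(err)
        if (bytes === 0) return fs.close(fd, function () { cb(true) }) // end of file
        cb(null, Buffer.from(buf.slice(0, bytes)))
      })
    }
  }
}
```
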
53-# sink that may fail out of turn.
52 +# A sink that may fail out of turn.
5453
5554 An http client connects to a log server and tails a log in real time.
56-(another process writes to the log file,
57-but we don't need to think about that)
55 +(Another process will write to the log file,
56 +but we don't need to worry about that.)
5857
59-The source is the server log stream, and the sink is the client.
58 +The source is the server's log stream, and the sink is the client.
6059 First the source outputs the old data; this will always be a fast
61-response, because that data is already at hand. When that is all
62-written then the output rate may drop significantly because it will
63-wait for new data to be added to the file. Because of this,
64-it becomes much more likely that the sink errors (the network connection
60 +response, because that data is already at hand. When the old data is all
61 +written, the output rate may drop significantly because the server (the source) will
62 +wait for new data to be added to the file. Therefore,
63 +it becomes much more likely that the sink will error (for example if the network connection
6564 drops) while the source is waiting for new data. Because of this,
6665 it's necessary to be able to abort the stream reading (after you called
6766 read, but before it called back). If it were not possible to abort
6867 out of turn, you'd have to wait for the next read before you could abort
69-but, depending on the source of the stream, that may never come.
68 +but, depending on the source of the stream, the next read may never come.
7069
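To make the out-of-turn abort concrete, here is a sketch in which the source simulates
waiting for new log data; the timeout and the log line are made up:

```js
// Source sketch: pretends to tail a log, waiting a long time for new data.
// If it is aborted while a read is outstanding, it answers that read too.
function tailSource () {
  var waiting = null
  return function read (abort, cb) {
    if (abort) {
      if (waiting) {                  // a read is still outstanding
        clearTimeout(waiting.timer)
        waiting.cb(abort)             // answer it with the abort reason
        waiting = null
      }
      return cb(abort)
    }
    var timer = setTimeout(function () {
      waiting = null
      cb(null, 'a new log line\n')    // new data finally arrived
    }, 60 * 1000)
    waiting = { cb: cb, timer: timer }
  }
}

var read = tailSource()
read(null, function (end, data) {
  console.log('read ended with:', end) // called with the abort reason, not data
})
// The connection drops before that read has called back,
// so the sink aborts out of turn:
read(new Error('connection lost'), function () {})
```
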
71-# a through stream that needs to abort.
70 +# A through stream that needs to abort.
7271
73-Say we read from a file (source), JSON parse each line (through),
72 +Say we wish to read from a file (source), parse each line as JSON (through),
7473 and then output to another file (sink).
75-because there is valid and invalid JSON, the parse could error,
76-if this parsing is a fatal error, then we are aborting the pipeline
77-from the middle. Here the source is normal, but then the through fails.
78-When the through finds an invalid line, it should abort the source,
74 +If the parser encounters invalid JSON it will error and,
75 +if this parse error is fatal, the parser needs to abort the pipeline
76 +from the middle. Here the source reads normally, but then the through fails.
77 +When the through finds an invalid line, it should first abort the source,
7978 and then callback to the sink with an error. This way,
8079 by the time the sink receives the error, the entire stream has been cleaned up.
8180
82-(you could abort the source, and error back to the sink in parallel,
83-but if something happened to the source while aborting, for the user
84-to know they'd have to give another callback to the source, this would
85-get called very rarely so users would be inclined to not handle that.
86-better to have one callback at the sink.)
81 +(You could abort the source and error back to the sink in parallel.
82 +However, if something happened to the source while aborting, then for the user
83 +to discover this error they would have to give the source another callback. As this
84 +situation would occur only rarely, users would be inclined not to handle it, leading to
85 +the possibility of undetected errors.
86 +Therefore, as it is better to have one callback at the sink, wait until the source
87 +has finished cleaning up before calling back to the sink with an error.)
8788
88-In some cases you may want the stream to continue, and just ignore
89-an invalid line if it does not parse. An example where you definately
90-want to abort if it's invalid would be an encrypted stream, which
89 +In some cases you may want the stream to continue, and the through stream can just ignore
90 +any lines that do not parse. An example where you definitely
91 +want a through stream to abort on invalid input would be an encrypted stream, which
9192 should be broken into chunks that are encrypted separately.
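
A sketch of the aborting through stream described above, following the convention that a
pull-stream through takes the upstream `read` and returns a new one; treating every parse
failure as fatal is just one possible policy:

```js
// Through sketch: parse each line as JSON; on a parse error, abort the
// source first, and only then pass the error on to the sink.
function jsonParseThrough (read) {
  return function (abort, cb) {
    if (abort) return read(abort, cb)    // aborts from the sink go upstream
    read(null, function (end, line) {
      if (end) return cb(end)            // end or upstream error
      var obj
      try {
        obj = JSON.parse(line)
      } catch (err) {
        // clean up the source, then report the error downstream
        return read(err, function () { cb(err) })
      }
      cb(null, obj)
    })
  }
}
```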
