ssb-wikimedia
Wikimedia to Secure Scuttlebutt bridge. Sync revisions from Wikimedia pages into SSB.
CLI
Usage: ssb-wikimedia-sync [-y] [-n] [-h] [<url>...]
Options:
-y Yes mode. Publish messages without prompting for confirmation.
-n Dry run. Output message drafts as JSON instead of publishing them.
-h Help mode. Output usage text and exit.
If URLs are not given on the command-line, they are read line-by-line from a file in the SSB directory: ~/.ssb/wikimedia-pages.txt
Lines in that file starting with "#" are considered comments and skipped. Blank lines are also skipped.
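The file format above is simple enough to parse in a few lines. The following is a minimal Python sketch of those rules, not the tool's own implementation. The function name `parse_page_list` is hypothetical, the `/wiki/...` form of the example URLs is an assumption, and treating whitespace-only lines as blank is also an assumption.

```python
def parse_page_list(text):
    """Return the page URLs listed in the text of a wikimedia-pages.txt file."""
    urls = []
    for line in text.splitlines():
        # Lines starting with "#" are comments.
        if line.startswith("#"):
            continue
        # Blank lines are skipped (assumption: whitespace-only counts as blank).
        if not line.strip():
            continue
        urls.append(line.strip())
    return urls


example = """# pages to sync
https://en.example.org/wiki/Article_Title

https://en.example.org/wiki/Another_Article
"""
print(parse_page_list(example))
```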
Schema
type: wikimedia/revisions
The message content is as returned from the Wikimedia Query/Revisions API, with the following changes:
- Message type is added.
- Property "site" refers to the Wikimedia site base URL.
- Property "pageId" is a denormalization used to facilitate querying SSB for an article. It is the SSB blob hash of the values for the site property and title property, separated by a tab ("\t").
- Content ("*" property in revision slots) is replaced with the id of the SSB blob containing that content, at property "link".
- Property "parents" is an array of links to the latest previous message(s) of the same type for the same page containing previous revisions to the page. Any revision parent id referenced from the current message should be found in the current message or a message referenced in this parents array.
- Property "userId" is added to each revision object. It is computed the same as "pageId" but for the User page of the revision author.
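The "pageId" and "userId" derivations described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's actual code: the function names are hypothetical, the string is assumed to be UTF-8 encoded before hashing, and the User page title is assumed to be "User:" followed by the username. The blob id format itself ("&", then the base64-encoded SHA-256 digest, then ".sha256") is the usual SSB one.

```python
import base64
import hashlib


def ssb_blob_id(data: bytes) -> str:
    """SSB blob id: "&" + base64(sha256(data)) + ".sha256"."""
    digest = hashlib.sha256(data).digest()
    return "&" + base64.b64encode(digest).decode("ascii") + ".sha256"


def page_id(site: str, title: str) -> str:
    # Site and title joined by a tab, then hashed like a blob.
    # Assumption: the joined string is encoded as UTF-8.
    return ssb_blob_id((site + "\t" + title).encode("utf-8"))


def user_id(site: str, user: str) -> str:
    # Assumption: the User page title is "User:" + username.
    return page_id(site, "User:" + user)


print(page_id("https://en.example.org/", "Article Title"))
print(user_id("https://en.example.org/", "Username"))
```

Because the id depends only on the site and title, any peer can recompute it locally and query SSB for messages about a page without knowing the numeric "pageid".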
Example:
{
"type": "wikimedia/revisions",
"pageid": 77777777,
"ns": 0,
"site": "https://en.example.org/",
"title": "Article Title",
"pageId": "&...sha256",
"parents": [
"%...sha256"
],
"revisions": [
{
"revid": 999999999,
"parentid": 999999998,
"user": "Username",
"userId": "&...sha256",
"timestamp": "2019-11-22T00:00:00Z",
"roles": [
"main"
],
"slots": {
"main": {
"size": 0,
"sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
"contentmodel": "wikitext",
"contentformat": "text/x-wiki",
"link": "&47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=.sha256"
}
},
"comment": "Example edit summary",
"tags": []
}
]
}
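As a rough illustration of how a revision object from the API might be turned into the form shown above, here is a Python sketch. It is not the tool's implementation. The helper names are hypothetical, a plain dict stands in for the SSB blob store, content is assumed to be UTF-8 encoded, and a "parentid" of 0 is taken to mean the revision has no parent. Following the schema description, content moves from "*" to a blob referenced at "link".

```python
import base64
import hashlib


def ssb_blob_id(data):
    """SSB blob id: "&" + base64(sha256(data)) + ".sha256"."""
    digest = hashlib.sha256(data).digest()
    return "&" + base64.b64encode(digest).decode("ascii") + ".sha256"


def link_slot_content(revision, blobs):
    """Move each slot's "*" content into `blobs` and leave a "link" behind."""
    for slot in revision.get("slots", {}).values():
        if "*" in slot:
            data = slot.pop("*").encode("utf-8")
            blob_id = ssb_blob_id(data)
            blobs[blob_id] = data  # stand-in for adding the blob to SSB
            slot["link"] = blob_id
    return revision


def unresolved_parents(message):
    """Return parentid values not found among this message's own revisions.

    Per the schema, these should be found in a message listed in "parents".
    """
    revids = {rev["revid"] for rev in message["revisions"]}
    return [rev["parentid"] for rev in message["revisions"]
            # Assumption: parentid 0 (or missing) means no parent revision.
            if rev.get("parentid") and rev["parentid"] not in revids]


blobs = {}
rev = {"revid": 999999999, "parentid": 999999998,
       "slots": {"main": {"size": 0, "*": ""}}}
link_slot_content(rev, blobs)
print(rev["slots"]["main"]["link"])
print(unresolved_parents({"revisions": [rev]}))
```

A non-empty result from `unresolved_parents` means the message needs at least one entry in "parents" that leads to the missing revision.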
Config
{
"wikimedia": {
"contact": "example@example.org",
"bot": false
}
}
- wikimedia.contact: contact info for the operator, to be included in the User-Agent string for requests to the Wikimedia API.
- wikimedia.bot: whether to include "bot" in the User-Agent string. This should be set to true if you are running the command in an automated way.
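To make the effect of these two settings concrete, the sketch below builds a User-Agent string from a config object in Python. The exact layout of the string the tool sends is not documented here, so the format, the function name `build_user_agent`, and the `0.0.0` version placeholder are all assumptions made for illustration.

```python
def build_user_agent(config, name="ssb-wikimedia", version="0.0.0"):
    """Build a User-Agent string from the "wikimedia" config section.

    Assumed layout: "<name>/<version> (<contact>) bot", where the contact
    and "bot" parts appear only when configured.
    """
    wm = config.get("wikimedia", {})
    parts = [f"{name}/{version}"]
    contact = wm.get("contact")
    if contact:
        parts.append(f"({contact})")
    if wm.get("bot"):
        parts.append("bot")
    return " ".join(parts)


config = {"wikimedia": {"contact": "example@example.org", "bot": False}}
print(build_user_agent(config))
```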
References
- https://www.mediawiki.org/wiki/API:Query
- https://www.mediawiki.org/wiki/API:Revisions
- https://www.mediawiki.org/w/api.php?action=help&modules=query%2Brevisions
- https://meta.wikimedia.org/wiki/Special:MyLanguage/User-Agent_policy
License
Copyright (C) 2019 cel @f/6sQ6d...
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/.