Paper Wikipedia

From Meta, a Wikimedia project coordination wiki


Untitled

A paper edition of Wikipedia is probably a long way off.

It is a high priority of the simple ideology of Wikitax.
It would presumably be implemented via Larry's sifter project.
There are already Wikipedia pages printed off and floating around the world. <--examples?--> Maybe we should try to find them and see what use they're being put to? Like put a notice on the whitespace saying "Are you reading this on paper? If so let us know at..." Then we can find out where paper versions are being used.
Absolutely—putting a unique timestamp (like the IP-stamp) on the page would make it easy to know which version they were looking at, too. Easy to track.
With CSS, it's pretty simple to stick in a block of text that is only displayed in the print version (so it's not shown on the screen). -- Wapcaplet 16:23 1 July 2003 (UTC)
Speaking personally, when I print material from Wikipedia I usually just highlight the text I want and print only that. I usually use printing to extract and keep particular sections from long articles, or whole articles if I think they might slip my memory or I want that particular version against any subsequent changes.
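A minimal Python sketch of the print-only notice discussed above, assuming the site can attach a small stylesheet rule and an HTML fragment to each page. The class name `print-only`, the function `print_notice`, and the contact address are all hypothetical; only the `@media print` rule itself is standard CSS.

```python
from datetime import datetime, timezone
from html import escape
from typing import Optional

# Hidden on screen, shown only when the page is printed.
PRINT_ONLY_CSS = """\
.print-only { display: none; }
@media print {
  .print-only { display: block; }
}
"""


def print_notice(revision_id: int, contact: str,
                 now: Optional[datetime] = None) -> str:
    """Return an HTML fragment carrying the paper-reader notice and a stamp.

    revision_id and contact are illustrative parameters, not an existing API.
    """
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%d %H:%M UTC")
    text = (
        f"Are you reading this on paper? If so, let us know at {contact}. "
        f"(Revision {revision_id}, page generated {stamp}.)"
    )
    return f'<div class="print-only">{escape(text)}</div>'


notice = print_notice(
    42, "paper@example.org",
    datetime(2003, 7, 1, 16, 23, tzinfo=timezone.utc),
)
```

The stamp records when the page was served rather than when it was printed, which is probably enough to tell which version a paper reader was looking at.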

There are some things we can do now that might simplify the gargantuan task of condensing and sorting that creating a paper edition would involve.

Alphabetical tag

Add a tag to each article that says where it should be placed in an alphabetically ordered list. For example, in Bill Clinton the tag would be something like <alpha>Clinton, Bill</alpha>.

For persons, this is much easier to do with a proper person DTD. Then you simply say <person>Clinton, Bill</person> and part of the semantics of a person tag is that last names are first, first names and titles second, or something—whatever works. For locations, like say ecoregions or nations in a spacetime DTD, e.g. <ecoregion>Sahara Desert</ecoregion>, <nation>Union of Soviet Socialist Republics</nation> can do the same job, and conventions for ordering by map location or period of history can be followed automatically. For everything else, though, you still need an <alpha> tag. The reason for semantic tags is to mark out things with special ordering requirements and relationship requirements (human lives; geographic places have this problem very obviously.).
Academic fields likewise, as "cognitive psychology" belongs under "psychology" as a subitem, not under 'c' with its master under 'p'. But those are things we have under control, mostly, whereas persons and spacetime are just too numerous to deal with without automated tools. Also paper biographies, paper atlases, are immensely useful documents, more so than most other compilations I know.
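As a rough illustration of how such tags could drive the ordering, here is a small Python sketch. The function name `alpha_key`, the `kind` labels, and the `explicit` override (standing in for an <alpha> tag) are assumptions made for this example, and the "last word is the surname" rule is deliberately naive.

```python
from typing import Optional


def alpha_key(title: str, kind: str = "other",
              explicit: Optional[str] = None) -> str:
    """Return the string a paper index would file this title under."""
    if explicit:
        # An explicit tag always wins over any guess.
        return explicit
    if kind == "person":
        parts = title.split()
        if len(parts) > 1:
            # Naive rule: treat the last word as the surname.
            return f"{parts[-1]}, {' '.join(parts[:-1])}"
    return title


articles = [
    ("Bill Clinton", "person"),
    ("Sahara Desert", "ecoregion"),
    ("Abraham Lincoln", "person"),
]
ordered = sorted(articles, key=lambda a: alpha_key(a[0], a[1]).casefold())
# ordered titles: Bill Clinton, Abraham Lincoln, Sahara Desert
```

The naive rule breaks on monarchs and multi-word surnames, which is exactly where an explicit tag or a person DTD would have to take over.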

Distinguish "good" redirects

Most redirects we don't want on paper:

  • misspellings
  • CamelCase pages
  • alternate spellings that are very close, e.g. Glycerine/Glycerin

But there are some that we MAY want to read "see..." (depending on space in the future paper edition). Suggest creating a new tag that is a synonym of #REDIRECT to mark these.

Note also that we are supposed to boldface all legitimate alternate terms for the same concept—so looking for the boldfaced terms should in theory give you the list of "good redirects", and allow these tags to be automatically generated. If not, the article needs to be fixed to boldface the alternates.
It seems there's a pretty well-developed w:Category:Redirects already...I've gone ahead and added w:Category:Unprintworthy redirects and w:Category:Printworthy redirects, after investing a little time in figuring out how they worked out their current system. Anyone care to write a bot for the task of winnowing between these categories? Prior categorization & closeness in alphabetization might be helpful data in that task...--Polyparadigm 06:08, 3 October 2005 (UTC)
Reworking the categorization templates has really populated the list of printworthy & unprintworthy redirects, without the need for a bot. The best news is, proper categorization using the old standards will probably sort out a high percentage of them. It seems, though, that w:Category:Redirects for names will need some special attention, as these redirects should work backward in print. By the way, creating such redirects in a thorough manner and with proper categorization will also populate the list of person-name articles, obviating the need for in-article metadata: We know that the subject of the redirect, and the first phrase in bold within that article, are names.--Polyparadigm 22:45, 6 October 2005 (UTC)
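The boldface heuristic suggested in this thread could be sketched roughly as follows. This is only an illustration: the function names are made up, the sample article text is invented, and the regular expression handles plain `'''bold'''` wikitext markup only, not templates or nested formatting.

```python
import re

# Wikitext marks bold text with three apostrophes on each side.
BOLD = re.compile(r"'''(.+?)'''")


def bold_terms(wikitext: str) -> list[str]:
    """Return the boldfaced phrases found in an article's wikitext."""
    # strip("'") tidies up bold-italic markup, which uses five apostrophes.
    return [m.strip("'").strip() for m in BOLD.findall(wikitext)]


def printworthy_redirects(wikitext: str, redirect_titles: list[str]) -> list[str]:
    """Keep only the redirects whose title appears in bold in the target."""
    wanted = {term.casefold() for term in bold_terms(wikitext)}
    return [t for t in redirect_titles if t.casefold() in wanted]


sample = ("The '''Union of Soviet Socialist Republics''' ('''USSR'''), "
          "also known as the '''Soviet Union''', was a state in Eurasia.")
redirects = ["USSR", "Soviet Union", "Sovjet Union", "SovietUnion"]
good = printworthy_redirects(sample, redirects)
# good == ["USSR", "Soviet Union"]
```

Misspellings and CamelCase titles fall out automatically because nobody boldfaces them in the article, which is the point of the heuristic.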

my answer from wikitech-l:

No, please make a different project for this. I already hate the interlanguage links, we don't need more of this meta stuff in the article source text.

So, we are using only free text techniques to figure out that all references to "The President" from 1992 to 2000 are to "Bill Clinton". That really stinks. I think meta stuff is essential for WHO including all titles, and for WHEN+WHERE including not just political but climate and ecology constructs that are stable or cyclic, like "El Nino" or "Amazon Rainforest" (same as "Sahara Desert"). Doing this right makes it really easy to print "all articles on Brazil" etc.—and know that El Nino and Amazon Rainforest are about Brazil, and Sahara Desert is not.

I don't want to know how many people hit on "Edit", see the interlanguage links (ru:Мамонт ...) and think that it's too complicated to edit articles. --Kurt Jansson 01:34 Jan 24, 2003 (UTC)

If the tags are self-evident, not a problem. Most people who edit text now understand WP or Word or HTML conventions, and most high school kids will be able to hack XML the day they get out, very soon. It's not really that hard.
You know, where we do have these meta-information tags, why don't we just put them at the *bottom* of the page? For some reason we standardized on putting language links at the top; I think that was a mistake. I would recommend moving them to the bottom, where they're out of the way for the non-techies; likewise for proposed category tagging if we ever end up with such things. Or alternatively we can have an entirely separate interface for meta-linking, but that's another level of complexity added. --Brion VIBBER 02:51 Jan 24, 2003 (UTC)
For the language links, a pull-down menu. When there are a lot of languages it looks not so ugly, it is better for blind users. Giskart 18:00 Jan 24, 2003 (UTC)
Yes, agreed, since you just don't do it that often. Once you start reading in another language, you are in another name space and should stay in it.

OK, I am for having a paper Wikipedia on one condition: that it is made from recycled paper! -fonzy

seconded, whatever Jimbo says about "green is bad" -- Tarquin
I'd like to see a plastic edition. lysdexia 16:53, 8 Nov 2004 (UTC)

As I already wrote, an XML version of the current syntax is technically one of the prerequisites for creating a paper version. It could then be converted to w:DocBook and turned into a PDF with the DocBook stylesheets. -- Nichtich

Yup, and the other DTDs for spacetime and people must come next, before any attempt to implement the Wikitax proposal. If we don't have sane conventions at least for the experienced editors and for machine-generated files, it will be too late by then. It's already the case that you have to know a great deal about naming conventions to correctly name monarchs, refer to towns or villages or nations, and so it's not much of a reach to rigorize the already-rigorous conventions for this into XML.

I propose that if a paper wikipedia is ever created, that lists of names appear in alphabetical order by last name and that each name be written in {surname}, {name} syntax. The current convention of {Name} {Surname} makes searching a list harder. There is already discussion about this in many places. Adding the possibilities of a paper wikipedia to the discussion changes some things. Robert Lee



Format of a paper Wikipedia

Clearly, a single-volume paper "raw" English-language Wikipedia is impossible, as we now (2004) have far more raw content than the Encyclopædia Britannica. Even after a filtering/tagging pass to eliminate the dross, we will still have similar or greater content. How can this be published on paper?

Articles could be ranked by frequency of pagehits. An article that is often requested on wikipedia.org is obviously an important article. This does not necessarily mean that it is a well-written article, but you should expect that it is also looked up quite often in a printed encyclopedia. This way, articles that have few pagehits on wikipedia.org could be excluded from the printed edition. -- Robamler 11:37, 26 Aug 2004 (UTC)
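A minimal sketch of that ranking in Python, assuming page-hit counts are available as a simple title-to-count mapping. The counts below are invented for illustration and `select_for_print` is a hypothetical name.

```python
def select_for_print(pagehits: dict[str, int], max_articles: int) -> list[str]:
    """Return the most-requested article titles, most popular first."""
    # Sort by descending hit count; ties are broken alphabetically
    # so the selection is reproducible.
    ranked = sorted(pagehits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:max_articles]]


# Invented numbers, for illustration only.
hits = {"Brazil": 900, "Four-vector": 40, "Bill Clinton": 700, "Lorentz group": 40}
chosen = select_for_print(hits, 3)
# chosen == ["Brazil", "Bill Clinton", "Four-vector"]
```

As noted above, popularity says nothing about quality, so a cut like this would likely be combined with some editorial filter.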

I don't really like the Britannica way, which is a micropedia/macropedia split, together with index volumes.

However, we could do what Brockhaus does, which is to produce a number of different formats from the same material.

Actually, Brockhaus has different formats from different text sources nowadays, iirc. -- Mathias Schindler 00:29, 16 Jul 2004 (UTC)

Hub and spoke model

We could produce:

  • a thick one-volume encyclopedia, "Wikipedia <year>", with only the "top level" articles, perhaps 200 to 400 pages of:
    • countries
    • elements
    • top level coverage only of specialist subjects
    • important historical figures

In particular, this should aim to target slowly moving information like histories of things, physical constants, geography, and so on so that its utility is preserved even if a given copy is a bit out-of-date.

  • a series of themed single-subject encyclopedias (maths, physics, history, society, etc.) entitled "Wikipedia: Mathematics <year>" and so on, which would duplicate the "top level" articles in that subject in the core volume, as well as having all the detailed articles on that subject

Thus,

  • the one-volume work would stand alone for those with minimal needs such as quick reference and early education
  • individual single-subject volumes would stand alone in their subject
  • the core volume and one or more single-subject volumes begin to round out your knowledge...
  • the complete set is the complete Wikipedia!

Advantages:

  • a great many links in specialist articles will be within the same volume
  • non-specialist links are generally to the "top level" volume
  • a partial set is partially useful, as opposed to a partial alphabetical encyclopedia, where the gaps are crippling and arbitrary
    • thus, it can be collected a volume at a time depending on a person's interest level and available resources, rather than having the "collect the set" principle beloved of encyclopedia salesmen in the 20th century
    • example: I might get a set consisting of Core, Maths, Physics. As I get more interested in history, I buy the History volume. Then work requirements push me into getting the Computer Science and Telecommunications volume...
I really like this idea. I foresee only one problem for those who get physics without math for instance, which is that the physics articles would constantly reference potentially non-top-level math articles. For instance, en:four-vector cites en:Lorentz group which is a Lie group. In some fields it would be impossible to untangle them. Laurascudder
If this is done by Category then perhaps articles could just appear as part of more than one subject-specific printed wikipedia. And then some of the disambig pages might not be needed. 203.218.234.235 15:07, 12 January 2006 (UTC)
I think only Good and Featured articles should be included. Just to save space because there's like a million articles and we can only pick about 3,000.

Mechanism

We can do all of this using the category system. By categorizing articles by subject, and having a "top level" flag that identifies them for inclusion in the top-level volume, we should be able to do all of the above.

-- The Anome 10:20, 13 May 2004 (UTC)
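One way the category mechanism could work, sketched in Python under simple assumptions: each article carries a set of subject categories and a boolean "top level" flag. The `Article` class, the `build_volumes` function, and the volume name "Core" are illustrative, not an existing MediaWiki feature.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Article:
    title: str
    subjects: set[str] = field(default_factory=set)
    top_level: bool = False  # hypothetical flag for the core volume


def build_volumes(articles: list[Article]) -> dict[str, list[str]]:
    """Group article titles into a core volume plus one volume per subject."""
    volumes: dict[str, list[str]] = defaultdict(list)
    for article in articles:
        if article.top_level:
            volumes["Core"].append(article.title)
        # An article in several categories is printed in each subject volume.
        for subject in article.subjects:
            volumes[subject].append(article.title)
    return {name: sorted(titles) for name, titles in volumes.items()}


articles = [
    Article("Physics", {"Physics"}, top_level=True),
    Article("Four-vector", {"Physics"}),
    Article("Lorentz group", {"Mathematics", "Physics"}),
    Article("Lie group", {"Mathematics"}),
]
volumes = build_volumes(articles)
```

Here "Lorentz group" lands in both the Mathematics and Physics volumes, which is one way to soften the cross-reference problem raised earlier in the thread, at the cost of some duplicated pages.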


Only Good Articles and Featured Articles, to save paper. 68.173.113.106 02:46, 25 January 2012 (UTC)

Question: Acid-free paper?

Ideally, books should last for at least 100 years, not the 10-20 years that poor-quality paper lasts. A proper paper Wikipedia should be part of the library stacks that may well outlast much of today's digital data. What are the tradeoffs of using recycled paper vs. acid-free paper? -- The Anome 10:27, 13 May 2004 (UTC)

I think I know the answer, but are there acid-free recycled papers? -- user:zanimum
Nope, because the very process of recycling paper mostly involves breaking it down with acid. 70.29.184.247 17:41, 30 October 2005 (UTC)
Maybe there could be some neutralization process afterwards? --Puzzlet Chung 17:46, 30 October 2005 (UTC)
You can buy acid free, chlorine free, 100% post consumer product recycled paper at pretty much anywhere that sells paper; Staples has 10 reams of it for $45.
Isn't that the last thing you want for a printed work of Wikipedia? After all, Wikipedia is supposed to be constantly fresh and dynamic, unlike a regular, printed encyclopedia. It'd make more sense if it were made of paper which bio-degraded within a decade - ideally, there'd be new copies coming out and replacing it well before then.
Books printed on biodegradable paper, forcing the owners to replace them every 10 years or so? What is this world coming to? Planned obsolescence!

Suggestion: Printing separate articles

I was looking for a way to print a couple of math articles when it occurred to me that it would be possible to create a PDF version of the article formatted with LaTeX.

I, among several other people, have a need to print and/or save sections of a MediaWiki. I see no great need to be able to do this from the WWW interface, but is there any chance that somebody might be able to re-use some of the MediaWiki code and generate documents, in HTML or perhaps LaTeX, directly from the MySQL database?

Proposal

Wikalmanac, essentially a compilation of all w:fact sheets known to man, and alien lysdexia 16:53, 8 Nov 2004 (UTC)

Ahoy. I had a similar idea the other day, of a Wikicalendar; some kind of paper calendar which contains one topic per day; not entirely in detail, but rather a short abstract on the subject, and the interested reader could turn to his nearest internet connection for the full article - this would be a simple way to let the Paper Wikipedia Madness™ ensue. These could be made in themes, e.g. animals, science, mathematics, for kids, or whatever, and could be pretty interesting. I'd like one on my wall! --Smári McCarthy 23:25, 26 January 2006 (UTC)
Interesting idea. I just got a one-a-day calendar written by someone who read the Encyclopedia Britannica, and then wrote 365 collections of odd facts. So the Paper Wikipedia Madness™ could be abstracts on topics (within a theme, of course) that are not part of common knowledge, rather than abstracts on commonplace subjects. A short-term solution to the Paper Wikipedia difficulties. This could significantly increase Wikipedia online hits as well! --Iowacrusader 19:22, 1 February 2006 (UTC)

It would be nice to set up an open-source printer utility, similar to ClickBook, in order to print pages in a way that they can easily be bound into trade paperbacks (like so) or even sewn books, using an ordinary printer. It wouldn't be hard to take the document's length and compute the proper number of pages per quire.
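The arithmetic behind such a utility might look like the Python sketch below. It assumes the simplest case: every sheet is folded once, so it carries four pages, and sheets are nested inside one another to form each signature. The function names and the default of four sheets per signature are choices made for this example.

```python
import math


def plan_signatures(page_count: int, sheets_per_signature: int = 4) -> tuple[int, int, int]:
    """Return (signatures, pages per signature, blank pages of padding)."""
    if page_count <= 0 or sheets_per_signature <= 0:
        raise ValueError("page_count and sheets_per_signature must be positive")
    pages_per_signature = 4 * sheets_per_signature  # 2 leaves x 2 sides per sheet
    signatures = math.ceil(page_count / pages_per_signature)
    blanks = signatures * pages_per_signature - page_count
    return signatures, pages_per_signature, blanks


def imposition(pages: int) -> list[tuple[int, int, int, int]]:
    """Page numbers to print on each sheet of one signature, outermost first.

    Each tuple is (front left, front right, back left, back right).
    """
    if pages <= 0 or pages % 4:
        raise ValueError("a signature needs a positive multiple of 4 pages")
    return [
        (pages - 2 * i, 2 * i + 1, 2 * i + 2, pages - 2 * i - 1)
        for i in range(pages // 4)
    ]


plan = plan_signatures(100)   # (7, 16, 12)
layout = imposition(8)        # [(8, 1, 2, 7), (6, 3, 4, 5)]
```

A 100-page article would therefore need seven 16-page signatures with twelve blank pages at the end; choosing a different number of sheets per signature trades thicker folds against wasted blanks.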

Also, since the binding itself is a fairly large proportion of the expense, it would be nice to make folded, but uncut, signatures commercially available. We could include a short booklet on bookbinding, reminiscent of this but perhaps more detailed and geared to greater durability. A school, for instance, could buy signatures and run a bookbinding class, sewing up a set of encyclopedias each year. This might also allow consumers to select what topics each copy of the book contains, though of course that would require modular page-numbering and a laser-printed TOC page...admittedly, this last one is a fairly wild idea. --Polyparadigm 20:44, 2 October 2005 (UTC)

Update: I found an extensive article on bookbinding here, and I've gotten in touch with him about the possibility of licensing his material for use here and/or collaborating on a bookbinding wikibook.--Polyparadigm 14:51, 5 October 2005 (UTC)

Probably, it would be more appropriate to release a CD/DVD ISO file(s) of Wikipedia with an ftp and/or bittorrent access, so that every publishing house could easily print and distribute Wikipedia in paperback. Yeah, just as GNU/Linux CD/DVD distribution. This method is fully compatible with a GNU/FDL as preferred by the Wikipedia. Any copyright violations could easily be corrected by releasing the next version of the ISO file(s). That'll free Wikipedia of ANY content lawsuits and shift the responsibility of printing in paperback to the publishing houses, so they will be forced to publish Wikipedia content on a non-commercial basis (only to cover their overheads) or not to print it at all. And, finally, such ISO would be of a great value for the USERS (equally for readers and for writers) of Wikipedia. Once downloaded, it can be distributed via LAN and spread out the town so fast, one couldn't possibly imagine. Please, pardon me if I said something wrong/trivial in an inappropriate place. Thanx, --Yuriybrisk

See en:Wikipedia:Wikipedia-CD/Download

The only problem I can think of with this idea is the fact that publishing companies like to make money - they wouldn't bother printing something if they knew there wasn't any way to make money off of it, especially if they had to bear the responsibility for copyright violations and such. 65.29.83.197 16:28, 3 August 2006 (UTC)

The thing is, the first to offer a Wikipedia book could conceivably make a fair profit (which is fine). Thus, they do have an incentive to publish it. As others then (possibly) enter the market, prices could drop to costs. Essentially, it becomes a commodity. It's impossible to make profit over the long term in a commodity market, but lots of companies sell them anyway (otherwise they wouldn't be commodities). Superm401 | Talk 16:26, 10 December 2007 (UTC)

Related Projects

PediaPress offers individual books based on Wikipedia articles

PediaPress proposal

PediaPress currently donates a portion of every sale to the WMF. This presents an interesting opportunity to get some extra money for the WMF. We can sort the best articles (ostensibly through a Wikiproject), separate them into alphabetical categories, print en masse, then sell them over amazon and ebay. Once these sets go, the demand for them (hyped up by the belief that the articles in the book must have been "inspected") will bring it to the point that brick and mortar stores will stock it. Thoughts?

It's already being done

See the Publishing imprints section in the article VDM Publishing on English Wikipedia and the Slashdot post Print-On-Demand Publisher VDM Infects Amazon. Of course they can get away with it by citing the CC-BY-SA 3.0 License and the GFDL licenses, even though it's a borderline scam. Maybe the WMF should set up its own print on demand service. People could create their own book from selected articles and pay for the cost of printing and shipment with a small donation to the WMF. SpeakFree 18:01, 26 May 2011 (UTC)

w:Help:Books#Printed_books_from_PediaPress. πr2 (t • c) 13:39, 12 April 2013 (UTC)
Do a search on Google Books for inauthor:"Source Wikipedia". Championtalken-wiki 22:14, 24 October 2016 (UTC)

Offline releases

Wikipedia already makes a release version every now and then. Why not choose some of these articles and make them into a paper Wikipedia? Contests for cover page designs could be held. 68.173.113.106 00:47, 31 January 2012 (UTC)