Help:Import
Latest revision as of 17:24, 10 January 2011
There are two types of import, both accessed through Special:Import:
- transwiki import, also called interwiki import: import pages directly from another wiki; the settings of the destination wiki determine which source wikis are enabled; Template:Msg appears; after "Transfer pages into namespace" one can specify a target namespace; the option "all" actually means "the same as the original".
- upload import: import a file in a special XML format produced by exporting pages from another wiki; Template:Msg appears.
On Wikimedia wikis upload import is disabled.
Transwiki import
On many Wikimedia wikis transwiki import is disabled too; it gives Template:Msg: "No wikis from which to import have been defined and direct history uploads are disabled." However, pages from commons:, foundation:, w:, cs and fr: can currently be imported to Meta, and pages from Meta can be imported to mw:. The act of importing is added to the page history and to Special:Log/import.
If an imported page has the same name as an existing page in the target wiki, the page is overwritten if the imported page is newer (according to the timestamps). If an error occurs during the import, the import may be left partially complete (some pages imported, but not all). Since pages are overwritten, attempting the import again should not be a problem.
If you chose to include history information, you should also see the edits in the 'history' of the imported pages and in the user contributions. The edits will not show up in 'recent changes' (neither positioned at the time of the original edit, nor at the time of importing).
There is an option "Include all templates". One may want to edit the source page first, so that only the desired templates are imported along with it. To import a collection of pages conveniently, one can even create a page that transcludes them all and import that page with the option on.
Useful applications of importing include:
- when a page is moved to another wiki and subsequently edited there, have the history together in the target wiki; this is especially useful if the source page becomes more difficult to find due to page moves etc.
- when a page is moved to another wiki and deleted on the source wiki, preserve the history
Upload import
How to export, and the format of exported pages, is described at Help:export. Normally any user can export wiki pages to a file, but to import pages into a wiki from a file, you must have 'Sysop' privileges on that wiki. So if you have your own MediaWiki installation, then you should be able to see the 'Special:Import' page there.
To import wiki pages from your computer, click "Browse" to locate the file on your local file system.
Editing the import file
In the case of upload import, the file format is simple and readable, so the XML file can easily be edited between exporting and importing. This should be done with caution and integrity, because one can make antedated edits and use false user names, and in combination with deletion, one can "change history". Applications of this editing include:
- adding a note to the edit summary about the importing
- changing user names and/or page names to avoid name conflicts (just between the title tags and between the username tags or also in links and signatures)
- changing namespace names into the generic or the applicable ones (ditto)
Note that if two versions of the page have the same timestamp (because one was uploaded with the same timestamp as a preexisting version), the later (imported) version will show up in the edit history but not in the article itself.
See mw:Manual:XML Import file manipulation in CSharp for an example of working with these XML files in Visual Studio .NET C#.
Merging histories and other complications
If the import includes history information, and the edits involved a user name which in the importing project is used by somebody else, then upload import should be applied, and the occurrences of the user name in the XML file should first be replaced by another name, to avoid ambiguity. If the user name was not used yet in the importing project then the user contributions are available anyway, although an account is not automatically created.
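As a minimal sketch of the user name replacement described above, the following Python snippet rewrites matching username elements in an export file. The function name, the sample XML and the replacement name "Alice (imported)" are hypothetical, and the sample is simplified: real export files declare an XML namespace on the root element, which the tag comparison below tolerates. Only the username tags are changed; links and signatures inside the page text are left alone.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def rename_user(xml_text, old_name, new_name):
    """Return (new_xml, count) with every <username>old_name</username> renamed."""
    root = ET.fromstring(xml_text)
    count = 0
    for elem in root.iter():
        if local_name(elem.tag) == "username" and elem.text == old_name:
            elem.text = new_name
            count += 1
    return ET.tostring(root, encoding="unicode"), count

# Simplified, hypothetical export fragment (no namespace, one revision).
SAMPLE = """<mediawiki>
  <page>
    <title>Sandbox</title>
    <revision>
      <timestamp>2011-01-10T17:24:00Z</timestamp>
      <contributor><username>Alice</username></contributor>
      <text>Hello</text>
    </revision>
  </page>
</mediawiki>"""

new_xml, changed = rename_user(SAMPLE, "Alice", "Alice (imported)")
print(changed)
```

When a namespaced file is serialized again, ElementTree may rewrite the namespace prefixes, so it is worth checking the result before importing it.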
Just like when a page is referred to in a link, and/or put in a URL, generic namespace names are automatically converted, and if a prefix is not a namespace name the page will arrive in the main namespace. However, e.g. "Meta:" may be ignored (dropped) on a project that uses that prefix for interwiki linking. It may be desirable to change it in the XML file to "Project:" before importing.
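The prefix change can be done with an ordinary find/replace. The sketch below shows one way to do it in Python, limited to title tags; the function name is hypothetical, and it assumes that titles appear literally as <title>Meta:...</title> in the file. Links inside the page text are not touched.

```python
import re

def retarget_titles(xml_text, old_prefix="Meta", new_prefix="Project"):
    """Return (new_xml, count), rewriting '<title>Meta:' to '<title>Project:'."""
    pattern = re.compile(r"(<title>)" + re.escape(old_prefix) + ":")
    # A function replacement avoids any backslash handling in new_prefix.
    return pattern.subn(lambda m: m.group(1) + new_prefix + ":", xml_text)

sample = "<title>Meta:Import</title><title>Help:Import</title>"
new_xml, count = retarget_titles(sample)
print(new_xml)
```

Only titles that start with the old prefix are changed, so a page such as "Help:Import" keeps its name.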
If a page name exists already, importing revisions of a page with that name causes the page histories to be merged. Note that after inserting a revision between two existing revisions in the page history, the change made by the user who made the next edit seems different from what it actually has been: to see the actual change made by the user one has to take the diff between the two already existing revisions, not the diff with respect to the inserted one. Therefore this should not be done except to reconstruct the true page history.
A revision is not imported if a revision of the same date, and exactly the same time up to the second, exists already. In practice this occurs only when the revision has already been imported before, or when the revision one attempts to import was imported the other way around, or both were imported from a third site.
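The merging behaviour described in the last two paragraphs can be pictured with a toy model. This is not MediaWiki's implementation, only a sketch that assumes a revision is a (timestamp, text) pair and that timestamps are ISO 8601 strings in UTC, which sort chronologically as plain strings. The function name and the sample data are hypothetical.

```python
def merge_history(existing, incoming):
    """Merge incoming revisions into an existing history.

    A revision whose timestamp already exists is skipped, as described above.
    Returns (merged_history, skipped_revisions).
    """
    seen = {ts for ts, _ in existing}
    merged = list(existing)
    skipped = []
    for ts, text in incoming:
        if ts in seen:
            skipped.append((ts, text))
        else:
            merged.append((ts, text))
            seen.add(ts)
    merged.sort(key=lambda rev: rev[0])
    return merged, skipped

existing = [("2011-01-01T10:00:00Z", "A"), ("2011-01-03T10:00:00Z", "C")]
incoming = [("2011-01-02T10:00:00Z", "B"), ("2011-01-03T10:00:00Z", "C2")]
merged, skipped = merge_history(existing, incoming)
print([text for _, text in merged])
```

In this example revision B lands between A and C, so the history would now show the diff for C relative to B, although its author actually edited A.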
An edit summary may refer to, and possibly link to, another page. This may be confusing when the page has been imported but the target page has not.
The edit summary does not automatically show that the page has been imported, but in the case of upload import that can be added to the edit summaries in the XML file before importing. That can avoid some potential sources of ambiguity and/or confusion. When editing the XML file with find/replace, note that adding a text to the edit summaries requires distinguishing between edits which already have an edit summary, hence comment tags in the XML file, and those without these tags. If there are multiple pairs of comment tags, only the last one is effective.
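Working on the parsed XML rather than with find/replace makes the two cases easy to tell apart. The following is a minimal sketch under some assumptions: the function name, the note text and the sample file are hypothetical, and a new comment element is placed before the text element, which is an assumption about the expected element order. If a revision has several comment elements, the last one is annotated, since that is the effective one.

```python
import xml.etree.ElementTree as ET

def annotate_summaries(xml_text, note):
    """Append a note to every revision's edit summary, adding <comment> if absent."""
    root = ET.fromstring(xml_text)
    # Real export files declare a namespace on the root element; reuse it if present.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    for rev in list(root.iter(f"{ns}revision")):
        comments = rev.findall(f"{ns}comment")
        if comments:
            last = comments[-1]
            last.text = f"{last.text or ''} [{note}]".strip()
        else:
            # No summary yet: create one, before <text> when there is a <text>.
            children = list(rev)
            pos = next((i for i, c in enumerate(children) if c.tag == f"{ns}text"),
                       len(children))
            comment = ET.Element(f"{ns}comment")
            comment.text = f"[{note}]"
            rev.insert(pos, comment)
    return ET.tostring(root, encoding="unicode")

# Simplified, hypothetical export fragment: one revision with a summary, one without.
SAMPLE = """<mediawiki>
  <page>
    <title>Sandbox</title>
    <revision>
      <timestamp>2011-01-09T12:00:00Z</timestamp>
      <comment>fix typo</comment>
      <text>Hello</text>
    </revision>
    <revision>
      <timestamp>2011-01-10T17:24:00Z</timestamp>
      <text>Hello again</text>
    </revision>
  </page>
</mediawiki>"""

annotated = annotate_summaries(SAMPLE, "imported from Meta")
summaries = [c.text for c in ET.fromstring(annotated).iter("comment")]
print(summaries)  # ['fix typo [imported from Meta]', '[imported from Meta]']
```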
User contributions
Without provisions for user name conflicts, the user contributions list shows:
- the edits by the person registered under the user name concerned on the project
- for each wiki from which pages have been imported, the edits of imported pages before import, by the user who on the source project has the user name concerned
If at the time of import the page did not exist yet on the target site, the two can be distinguished by comparing the time of import with the time of the edit.
If the user page and user talk page do not have a user contributions link in the page margin then the user is not registered, so all their edits are imported.
Large-scale transfer
For a large-scale transfer, somebody with sufficient system privileges can move data within the server, which is more practical than sending large XML files from the server to a user's local computer and then back to the server.
Large files may be rejected for two reasons. The first is the PHP upload limit, set in the PHP configuration file php.ini:
; Maximum allowed size for uploaded files.
upload_max_filesize = 20M
The second is the hidden field that limits the size in the input form, found in the MediaWiki source code, in includes/SpecialImport.php:
<input type='hidden' name='MAX_FILE_SIZE' value='20000000' />
You may also need to change the following four directives in php.ini:
; Maximum size of POST data that PHP will accept.
post_max_size = 20M
max_execution_time = 1000 ; Maximum execution time of each script, in seconds
max_input_time = 2000 ; Maximum amount of time each script may spend parsing request data
; Default timeout for socket based streams (seconds)
default_socket_timeout = 2000
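To see which of these size limits actually applies, it can help to convert them to bytes and take the smallest. The sketch below is a rough estimate only: the function names are hypothetical, it handles the K, M and G shorthand suffixes that PHP reads as powers of 1024, and it ignores the fact that post_max_size covers the whole request rather than the file alone.

```python
def parse_php_size(value):
    """Convert a PHP shorthand size such as '20M' to a number of bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    value = str(value).strip()
    suffix = value[-1:].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)

def effective_limit(upload_max_filesize, post_max_size, form_max_file_size):
    """Return the smallest of the three size limits, in bytes."""
    return min(
        parse_php_size(upload_max_filesize),
        parse_php_size(post_max_size),
        int(form_max_file_size),
    )

limit = effective_limit("20M", "20M", 20000000)
print(limit)  # 20000000
```

With the values shown above, 20M is 20971520 bytes, so the hidden MAX_FILE_SIZE field of 20000000 is the tighter limit.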
See also
- data dumps describes the maintenance script maintenance/importDump.php which provides an alternate import mechanism, but hasn't always remained in working order with recent MediaWiki releases