Help talk:Sources
How info should be displayed
The sources field is multiline. Now, we have to decide how to use it. Some options:
- Option 1
Use a standard citation format, for instance cited in:
stated in: item about some book author: Some Author publisher: Some Publisher etc. page: page
see [1], P231
- Oppose Author, publisher etc. are properties of a book and they belong in the item about the book. This solution would increase workload and create the potential for data consistency problems. Silver hr (talk) 02:09, 29 March 2013 (UTC)
- So you want an item for each book? Do you have any idea how many new items we would have to create? A book like "The Lord of the Rings" could then have hundreds of items because of different publishers, languages, editions, ... Snipre (talk) 14:43, 7 April 2013 (UTC)
- Can you describe what exactly you think is the problem? The number of items, which you think will be too large? WD is a computer database, it's not constrained by space (practically), therefore I don't consider "too many items" to be a valid argument in itself. Besides, if all the component statements for a source statement would have to be entered every time anyway, it really is no different to instead enter them into an item and then use the item in the source statement. Actually, that's even less work in cases where you have several source statements that are the same--instead of entering author, publisher etc. every time per source statement, you enter it once into an item and then you only have to enter the item into the source statements. Silver hr (talk) 17:36, 12 April 2013 (UTC)
- Option 2
Try to store as much info as possible in the cited items themselves, for instance by creating one item for each edition of the book, and only add the part that cannot be stored there:
stated in: item about a particular edition of a book page: page
- For option 2 it is important to know how data will be accessed from a Wikipedia article: if the data of one item is "downloaded" when somebody opens the Wikipedia article, then storing part of the reference in a different item of the Wikidata database is a problem. Snipre (talk) 01:46, 27 March 2013 (UTC)
- The inclusions states that it will be possible to query data from other items. I imagine we could have a template that would expand the reference based on the reference ID. --Zolo (talk) 18:23, 28 March 2013 (UTC)
- Support I think both option 1 and 2 will work fine, but option 1 will create a lot of redundant data, and the maintenance will be difficult. If I want to add sources to the "member state" property of United Nations, I certainly don't want to add every source property (author, publisher, edition, ISBN...) nearly 200 times.
- I think this option is good for books, but I am not sure how to deal with other kinds of sources (I suppose we don't want to create items for every single webpage) --Stevenliuyi (talk) 20:56, 28 March 2013 (UTC)
- If we do not have an item for the resource, then it would not really be option 1 either. I would probably support creating an item for every cited peer-reviewed scholarly article, but I suppose that there will always be cases where we have to hardcode the full reference in the sources field. --Zolo (talk) 22:09, 28 March 2013 (UTC)
- @Stevenliuyi stop thinking of manual editing with Wikidata: bots can do the work. Snipre (talk) 01:32, 29 March 2013 (UTC)
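To make the difference between option 1 and option 2 concrete, here is a minimal Python sketch of option 2. Everything in it is an assumption made for illustration: the item ID `Q_EDITION_1`, the field names and the `resolve_reference` helper are hypothetical and do not correspond to actual Wikidata properties or API calls. The point is only that the edition data is entered once and each source statement carries just the item ID and the page.

```python
from dataclasses import dataclass

# Hypothetical store of edition items; the ID and fields are illustrative only.
EDITION_ITEMS = {
    "Q_EDITION_1": {
        "title": "Some Book",
        "author": "Some Author",
        "publisher": "Some Publisher",
    },
}


@dataclass(frozen=True)
class Reference:
    """Option 2 style reference: a pointer to an edition item plus a page."""
    stated_in: str  # ID of the item about a particular edition
    page: str       # the only part that cannot be stored in the edition item


def resolve_reference(ref: Reference, items: dict) -> str:
    """Expand a compact reference into a full citation string."""
    edition = items[ref.stated_in]
    return (f"{edition['author']}, {edition['title']}, "
            f"{edition['publisher']}, p. {ref.page}")


# Two source statements reuse the same edition item without repeating its data.
refs = [Reference("Q_EDITION_1", str(page)) for page in (12, 231)]
citations = [resolve_reference(ref, EDITION_ITEMS) for ref in refs]
```

Under option 1, the author and publisher would instead be repeated inside every `Reference`, which is the redundancy raised in the comments above.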
- Option 3
Do not use "stated in: some item", but rather an identifier like ISBN or DOI, when possible
ISBN: ISBN of the book page: page
See [2], P180 (depicts), Leonardo
- Comment How do Wikipedia articles get data (title, author, etc.) from a single ISBN/DOI? --Stevenliuyi (talk) 20:56, 28 March 2013 (UTC)
- Er, yes, that does not sound practical. If that is possible though, it might be nice to allow humans to use it and devise a bot to convert it to format 2. --Zolo (talk) 22:09, 28 March 2013 (UTC)
Citing databases
- Option 1: use a general ID property:
stated in: item about the database ID: string containing the relevant ID in the database
- Option 2: use a database specific property when available, a general ID property otherwise (see [3], P88):
stated in: item about the database database specific property: string containing the relevant ID in the database
- Comment It makes it easy to automatically generate external links. --Zolo (talk) 14:55, 26 March 2013 (UTC)
- Option 3: when possible, just the database specific property, with no header like "stated in" (see [4], P195)
database-specific property: string
- Comment That is clear and simple, but we seem to need the "stated in" header for books, and not using it for databases, may introduce some confusing inconsistency.--Zolo (talk) 14:55, 26 March 2013 (UTC)
- I don't think there would be confusing inconsistency. The "source" field for a statement already has the same meaning as the "stated in" property. The only reason we have to use "stated in" is because the design of the system is such that we have to use some property. In this case there would be a property that had a more specific meaning than "stated in", and one should generally use specific properties instead of their general equivalents. Silver hr (talk) 04:43, 29 March 2013 (UTC)
- Option 4: when possible, just the database specific property that has an item data type. The corresponding item contains the database specific property as a string and the necessary data (author etc.) from the database.
database-specific property: item about the reference
- Comment If we want to display the reference data and we don't have a way to dynamically pull it from external databases, then I think this is the only choice. Also, I just realized that if the data type of the property would be item, then the property could just be "stated in", not database-specific. Silver hr (talk) 04:43, 29 March 2013 (UTC)
- By "database-specific" I meant specific to one database, not the same property for all databases. I think we should always provide the ID of the database entry, as it may not always be the one specifically about the item. --Zolo (talk) 08:19, 29 March 2013 (UTC)
- Yes, I assumed that :). What I meant was the following; I'll give an example. Suppose we had a property "PMID" with the item datatype, and that property could only link to items that have, besides the actual article data, a "PMID string" property which would hold the actual PMID. And then suppose we had the same thing for DOI. Then, the properties "PMID" and "DOI" could simply be replaced by "stated in". So, if we have "stated in: Q#" and we wanted to know what kind of source Q# is, we could query it for a "PMID string" or "DOI string" property and read the value. Silver hr (talk) 22:12, 30 March 2013 (UTC)
- I am not sure I follow you: how would that allow you to say "the reelin gene is fully sequenced according to Uniprot entry P78509"? Or are you suggesting creating an item for every database entry? --Zolo (talk) 18:01, 3 April 2013 (UTC)
- You bring up an interesting point. I was thinking in terms of citation databases, which contain articles/article metadata, so the item created would represent the article and contain its metadata because the metadata is needed for display. For non-citation databases, such as your example, option 3 would probably be better. Silver hr (talk) 18:17, 12 April 2013 (UTC)
- I need to cite a lot of different databases for the Stratigraphy Task Force. Probably one per country (or geologic survey). I would favor Option 2, and was wondering if somebody with more experience could write a guideline for database citing on this page. --Tobias1984 (talk) 05:48, 13 May 2013 (UTC)
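The remark under option 2 that a database-specific property "makes it easy to automatically generate external links" can be illustrated with a short Python sketch. The property label `UniProt ID`, the URL pattern and the `external_link` helper are assumptions chosen for illustration, not existing Wikidata properties or a confirmed URL scheme; only the entry ID P78509 comes from the discussion above.

```python
from typing import Optional

# Hypothetical mapping from a database-specific property to a URL pattern.
# The pattern is an assumption; a general "ID" property has no such pattern.
URL_PATTERNS = {
    "UniProt ID": "https://www.uniprot.org/uniprot/{id}",
}


def external_link(prop: str, value: str) -> Optional[str]:
    """Return a link for a database-specific property, or None if unknown."""
    pattern = URL_PATTERNS.get(prop)
    if pattern is None:
        return None
    return pattern.format(id=value)


# Option 2 style source: "stated in" plus a database-specific property.
source = {"stated in": "item about UniProt", "UniProt ID": "P78509"}
links = [external_link(prop, value)
         for prop, value in source.items()
         if prop != "stated in"]

# With a general ID property (option 1) nothing says which pattern applies.
generic = external_link("ID", "P78509")
```

With option 1, the same link could only be built by first looking up the "stated in" item to find out which database the string belongs to.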
Some general questions
- Will/should Wikidata support Wikipedia references? Will/should Wikipedia use Wikidata to store and display references instead of the system in use presently? I've always assumed so, though I haven't seen a specific official statement or community consensus to that effect. Personally I think it should; after all references are just another kind of data.
- How much should Wikidata rely on external databases? This question is quite relevant here because it seems to me that pretty much every reference item is in some external database and has an identifier through which it can be referenced. On the one hand, WD aims to be an all-encompassing database so why not? But on the other hand, that would essentially be data duplication, which should generally be avoided. But the key question here seems to be: can we automatically extract reference elements (author etc.) from references in external databases? If not, then obviously the data will have to be stored in Wikidata. On a related note, this is also the approach Wikipedia takes; besides the actual embedding of references in article text, there is this - over 30k references stored as individual template subpages for doi alone (done by Citation bot).
Silver hr (talk) 04:11, 29 March 2013 (UTC)
- What do you mean exactly by "Will/should Wikidata support Wikipedia references?"? As for the second question, it will certainly have to cite many external databases (e.g. a census bureau database to get population data, among many others). But I think the goal is to build a free self-contained database. Superm401 (talk) 06:39, 1 April 2013 (UTC)
- On the technical side, I do not think we have anything special for citing Wikipedia. In practice, bots have added many statements with the somewhat vague reference: "imported from Wikipedia in English". For the few discussions I have seen on the issue, using Wikipedia as a reference is only a temporary solution. To me, that makes sense. Wikipedia is usually a very indirect source, and a rather unstable one at that. Wikipedia is supposed to have external sources for just about anything it claims, and using this external source here is more informative than just saying "Wikipedia says".
- There is no plan to directly transclude data from an external database to Wikidata. As it is much easier to import data from databases than from texts, I would imagine it makes sense to import much of them (it is easier to deal with one database than with several). One thing that I don't know, however, is how to deal with databases that use special controlled vocabularies. That's one of the questions around P:P107: what should we do if the definition for person is different in an external database than in the main Wikidata item for person? --Zolo (talk) 15:20, 2 April 2013 (UTC)
- If there is no plan to transclude data from external DBs to WD, then the logical conclusion is that such data needs to be imported. Which leads us to the question: which data is that? At the very least, any data that needs to be displayed (such as article/book citations: title, author etc.). As for everything else, I don't know.
- Regarding your question, could you provide an example problem? I'm not sure I understand exactly what you mean.
- Silver hr (talk) 18:52, 12 April 2013 (UTC)
- In non-scientific fields, words are often loosely defined, so that some databases have to rely on home-made controlled vocabularies. Take the Mérimée database for French listed buildings that is extensively used in fr.wikipedia. It classifies buildings by "building type". Building types form a tree that I once copied there. However, this tree is certainly not the only possible one, and Wikidata will most probably use a different system. That will "fell" the tree. At the same time, we may imagine that the people who made the database knew their stuff, and that their system has some relevance. In this case, should we add a "Mérimée type" to keep the Mérimée hierarchy safe? --Zolo (talk) 13:33, 15 April 2013 (UTC)
- Well, adopting any particular controlled vocabulary could be construed as POV. So I guess the solution would be to allow either none of them or any of them if they're notable enough. As for what's notable enough, I don't know.
- Also, I assume you meant 'add a "Mérimée type"' as a property, which is certainly one way of doing it. But it just crossed my mind that it might also work by using an "instance of"/"subclass of" statement with an appropriate source, which means there would be a multitude of such statements with different sources, thus negating the need for vocabulary-specific properties. I'm not sure which solution would be better though. (BTW there is no page at the link you provided, but I think I understood you anyway :) ).
- Silver hr (talk) 23:41, 18 April 2013 (UTC)
@no.2: We should create the items we need, see: Wikidata talk:Notability#Main types (GND). --Kolja21 (talk) 19:47, 3 April 2013 (UTC)
- 1. Wikidata has to respect Wikipedia's rule about sourced information: without a reference, no Wikipedia should use data from Wikidata.
- External databases have to be cited, and their data have to be duplicated and integrated in Wikidata, as Wikidata will be the database for data inclusion in Wikipedia. Then I think you won't be able to extract most of the data from external databases, because authority databases are most of the time copyrighted and not compatible with the Wikidata license. Snipre (talk) 15:18, 7 April 2013 (UTC)
- I'm sorry, I see now I haven't been clear in my first question, so I'll rephrase (and edit the original): Will/should Wikipedia use Wikidata to store and display references instead of the system in use presently? Silver hr (talk) 18:52, 12 April 2013 (UTC)
How do we cite references in wikidata ?
We have to define a policy for references in Wikidata. Here are some issues to solve in order to provide a policy:
If no item ?
Right now most items are tied to a Wikipedia article, so properties like author or title use the item datatype. But for references, most books or authors don't have a Wikipedia article and therefore have no item in Wikidata. It is not possible to have a property with 2 different datatypes, and working with 2 properties (one with string datatype and another with item datatype) for the same parameter is not an option if we want a homogeneous way to process data (query, list, ...).
So for references there are two possibilities: define an item property and we have to create an item each time it is needed (one item for each title and each author: millions of potential new items just for reference purposes, and in most cases these items will be used only once). We have to find a solution for books, but we have to think about scientific and newspaper articles too: a single solution is necessary. Snipre (talk) 15:04, 7 April 2013 (UTC)
- I would add that IRI is important. I realize that web references are sometimes discouraged (e.g. due to linkrot), but in some cases all or almost all the reliable sources are online. Superm401 - Talk 00:17, 13 April 2013 (UTC)
If item exists ?
If an item exists (or if we decide to create an item for each title) we have to think about how we will formally add reference data: do we want to have an item for each specific book defined by edition, publisher and ISBN? In that case a book like "The Lord of the Rings" can have several dozen items. Or do we want to integrate the data of different books in the same item? But in that case, how do we refer to a specific book described in the item page? Or do we want to work with mixed properties: the title and author properties are defined in a main item and other properties are added in the reference section of the item we want to source? Snipre (talk) 15:04, 7 April 2013 (UTC)
A possible solution: The Open Annotation Data Model
The Open Annotation Data Model is composed of three elements: body, annotation and target. A body would be an article represented by an item (Qxxx), the annotation/reference could take the form Rxxxx, and whenever we have a primary source (i.e. a book, article in Wikisource) it could take the form Sxxx. It could also be done with properties; however, that would add an additional layer of cognitive complexity (i.e. it would not be that easy to spot at first sight the function of each). The OA model simplifies the organization and provides an intuitive logical structure. Regarding the "The Lord of the Rings" example, if the OA model is used and there is a reference between bodies which are "works" (in the FRBR sense), then the annotation can hold the data about the "manifestation" (edition, ISBN, page nr, etc). There is also an ongoing discussion about the relationship between "works" and "manifestations" that I suggest following at the Wikidata:Books task force.--Micru (talk) 04:58, 13 April 2013 (UTC)
- Sorry, but this implies significant development work from the WD development team, so it is not the objective of this discussion: we need to define a policy with the existing tools in order to allow correct sourcing of data now and not at the end of the year. Snipre (talk) 08:55, 13 April 2013 (UTC)
- It does not necessarily need more development. It can be done with properties and left like that, or migrated later on.--Micru (talk) 17:32, 13 April 2013 (UTC)
- Conceptually, that sounds like a good solution, but I do not think that using different namespaces would be a very good idea. There are items for a huge variety of things, and we should certainly be able to differentiate between them. But we cannot use namespaces for that, as there could be so many of them (actually, if we had just two namespaces, I would think it should be one for items that should contain real data and another for those that are just hubs for sitelinks, but I suspect that always using properties is the one really scalable solution). --Zolo (talk) 10:39, 16 April 2013 (UTC)
Specific properties for reference ?
I think that the best solution is, in the end, to have a set of properties used only for reference purposes and which have only the string datatype: for title, for author, ... Snipre (talk) 09:25, 16 April 2013 (UTC)
- I do not quite see why you are proposing that. I think we clearly need a string-type property for titles, but that could also be useful outside references. Using different properties at different places would be confusing at best. Using strings for authors and publishers would be a step backward in terms of consistency, machine readability and "internationalizability" (think of Japanese people!). Even for minor authors, an item with just a name (and ideally a VIAF link) would be better than a string. --Zolo (talk) 10:45, 16 April 2013 (UTC)
- If we need a string property for title, we need a string property for author, and working with two datatypes for the same property is not possible for inclusion or query: we have to double every query to check both properties when looking for one value. By using only string properties for references we know which properties to query when looking for information. Snipre (talk) 13:08, 16 April 2013 (UTC)
- And if a book is published only in Japanese, it is not right to translate its title without the agreement of the author. What happens if someone makes a wrong translation? Only authorized persons like authors or publishers can make a translation. If newspapers or other media provide a translation we can use it, but only by giving the reference; no contributor can translate an original title on their own initiative. For a source we can only use information from a real document, and if the document is not translated, no translation is allowed. If you are using a Japanese version of a book to source a statement, you have to give the Japanese title even if there is an English version of the book. You cannot mix information. Snipre (talk) 13:26, 16 April 2013 (UTC)
- I am not proposing to translate titles. That is precisely why it should be a string. As of now, we do not have any property for the title. We use the label instead, but the label is not a property, and as I have shown elsewhere, there are many cases where it will not be the same as the title. Regardless, titles, author names and publisher names should be transliterated somewhere, as is standard practice. Otherwise things will be very hard to read (contrary to translations, transliterations are routinely done for convenience). We could use labels for that, but it would require dedicated items. --Zolo (talk) 15:06, 16 April 2013 (UTC)
New RFC about references and sources in Wikidata
Following the discussions in the Books task force and here, I have started a new RFC about sources and references in Wikidata to summarize the different options and to gather feedback from the community.--Micru (talk) 22:40, 16 April 2013 (UTC)
Volume
The parameter Volume in the table is really two different things for books and scientific articles (meanings 4 and 3 in wikt:volume#Noun). The English language happens to use the same word for these two things, but other languages don't. I think they should be handled as different parameters. Byrial (talk) 15:54, 18 April 2013 (UTC)
- Even if there are different nouns, the concept is the same: it is a subdivision of a work. But we can consider whether there is a real problem: if it is just an extension of a concept, there is still the possibility of explaining that in the description. Snipre (talk) 17:07, 18 April 2013 (UTC)
Laws
How to insert a reference to laws or judgments? --β16 - (talk) 10:46, 3 June 2013 (UTC)
- What kind of parameters do you have for a law article? And what kind of data do you want to source? Snipre (talk) 11:14, 3 June 2013 (UTC)
- The parameters can be: type (in Italy for example there are: Legge, Decreto, Regio Decreto, and others), date, number, and maybe title. It can be used as a source for the variation or creation of a new administrative division (for example "legge 11 giugno 2004 n. 147" for P571 in Q16167) or for the granting of flags and coats of arms. --β16 - (talk) 12:37, 3 June 2013 (UTC)
- Can't you use the official journal of the republic instead of the law? The problem with laws is to find a common system to cite the national laws of every country. So if you really want to use legal texts, please start by comparing the different legal structures and finding a common structure. From my point of view this will lead to the same problem as for the administrative divisions. Snipre (talk) 16:52, 3 June 2013 (UTC)
- According to w:en:Category:Legal citation and w:en:Category:Law citation templates, a common structure that can be used consists of: State, title (or short title, or name), type, date/year, number, optionally subsection (or paragraph). --β16 - (talk) 13:57, 4 June 2013 (UTC)
Web pages
Is there a temporary workaround for Web pages? The section is currently unhelpful, asking not to create items for them, but to use something currently impossible. --AVRS (talk) 10:56, 9 August 2013 (UTC)
- Without the url datatype there is no temporary solution, and we won't propose one, because once we start with one system it is very difficult to change habits. Snipre (talk) 11:18, 9 August 2013 (UTC)
Scientific article and author (P50)
Hi to all! I have a (small) problem with the authors of this scientific article: the two authors do not have items in Wikidata. What property do I have to use? Thanks! --Paperoastro (talk) 20:04, 11 August 2013 (UTC)
- Because the authors can be verified you can create items for them. I created Peter Philip Eggleton (Q14559451) for you. I pasted the website where I found his information in the description so it can be transferred once we can cite web pages. In the mean time some of the information about his person can also be cited with the publication, because it is also a source for him being a person, and for his occupation. It would still be good to find his middle name if possible. --Tobias1984 (talk) 20:55, 11 August 2013 (UTC)
- Ok, thank you very much! :-) --Paperoastro (talk) 22:43, 11 August 2013 (UTC)
Databases
In Kim Amb (Q3357572), I made some experiments with Sources, based on this edit by sv:User:Elinnea. The instructions tell me to "Add the database ID property to refer to a specific register of the database". The database in this case is DVD- or CD-ROM-based, and cannot be linked with a URI. The identifier in the database is here "Amb, Kim Oscar". I do not think it is a good idea to create a separate Property for this database. I thought of P357 (P357), but its description says "title of work"! This can better be compared with the title of a specific article in an encyclopedia or dictionary. -- Lavallentalk(block) 14:08, 18 August 2013 (UTC)
- In that case, better use the media section. If you don't want to create an item for that CD-ROM, use "title" instead of "stated by". But as the Swedish WP has a template for this reference, perhaps it is a good idea to create an item: just check if it is widely used, and if yes, create an item for the database and then you can use "stated by". Snipre (talk) 16:35, 18 August 2013 (UTC)
- I modified the guidelines to cover all cases for databases. Snipre (talk) 16:43, 18 August 2013 (UTC)
- An item for the CD-ROM was already created in this case. I linked it to a redirect to an article about the series of this database. (Not all parts have the same publisher.) But how do I add the information: "Amb, Kim Oscar"? -- Lavallentalk(block) 17:22, 18 August 2013 (UTC)
- You can use catalog code (P528): I started the discussion to extend the application field of the property. Snipre (talk) 18:01, 18 August 2013 (UTC)
Generation of references based on source properties
For verification and as a use case, we created a Lua module which resembles the Taxobox, see Module:Taxobox. It also shows references for the data used. While not meant to be complete, it might be a starting point for someone who wants to create a general Lua module to handle source properties and generate the corresponding references. -- Felix Reimann (talk) 08:15, 2 September 2013 (UTC)
General strategy for data extraction in WP
First a Lua module is necessary. Then the extraction has to work at 3 different levels:
- Level 1: data available in the source section of the claim.
- Level 2: specific item defined by the property "stated in"
- Level 3: For book, item defined as work item or author item
Snipre (talk) 01:58, 11 September 2013 (UTC)
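The three levels listed above amount to a lookup with fallbacks. The following Python sketch shows one way this could work; it is only an illustration of the idea, not the proposed Lua module. The item IDs (`Q_EDITION`, `Q_WORK`), the field names, the "edition of" link and the `lookup` function are all hypothetical.

```python
from typing import Optional

# Hypothetical items; IDs, field names and the "edition of" link are
# assumptions made for illustration.
ITEMS = {
    "Q_EDITION": {"publisher": "Some Publisher", "edition of": "Q_WORK"},
    "Q_WORK": {"title": "Some Title", "author": "Some Author"},
}

# Source section of a claim: only the data specific to this citation.
CLAIM_SOURCE = {"stated in": "Q_EDITION", "page": "231"}


def lookup(field: str, claim_source: dict, items: dict) -> Optional[str]:
    """Resolve one reference field by walking the three levels in order."""
    # Level 1: data available in the source section of the claim.
    if field in claim_source:
        return claim_source[field]
    # Level 2: the specific item given by "stated in".
    specific = items.get(claim_source.get("stated in"), {})
    if field in specific:
        return specific[field]
    # Level 3: for books, the work item that the specific item belongs to.
    work = items.get(specific.get("edition of"), {})
    return work.get(field)


page = lookup("page", CLAIM_SOURCE, ITEMS)            # found at level 1
publisher = lookup("publisher", CLAIM_SOURCE, ITEMS)  # found at level 2
title = lookup("title", CLAIM_SOURCE, ITEMS)          # found at level 3
missing = lookup("isbn", CLAIM_SOURCE, ITEMS)         # not found anywhere
```

Checking the levels in this order means a value given directly in the source section takes precedence over the same field stored in the cited item.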
Web page
If I have understood correctly, I should add this 2500 times? Since the GUI s***s, it's a h**l of a job! -- Lavallentalk(block) 08:32, 11 September 2013 (UTC)
- For webpages, yes. If you take the example of a book, a webpage is like a page, and creating an item for each page you use as a reference would definitely explode the number of items. I can only recommend that you use a bot to perform this kind of addition. Try to see if a bot can do that; if not, contact me. I am not a specialist, but I have a bot for this kind of addition. See user:Chembot. Snipre (talk) 10:11, 11 September 2013 (UTC)
- This is the piece of information that will connect the Wikidata-item with the corresponding item in the database of Statistics Sweden, that can hardly be done by a bot in a safe way. -- Lavallentalk(block) 11:01, 11 September 2013 (UTC)
- But you can do that in an Excel sheet with data and match controls, and then add the data through the API with a bot. What you need is just the matching of the Q-number and the code. Or you can import the codes manually, and once you finish, you provide the list of items you modified and the bot does only the reference addition. Snipre (talk) 11:11, 11 September 2013 (UTC)
- If you are going to use it so often, you should really make a new item. Then you just use stated in (P248) in the source. --Tobias1984 (talk) 11:17, 11 September 2013 (UTC)
- @Tobias1984. I don't think we can propose that solution, because in the end who sets the limit? And for data extraction this will imply more tests: the code will need to check whether there is an item for the webpage or not. So I prefer to say no but to propose automatic ways to add the same kind of data x times. We have bots for that. Snipre (talk) 12:31, 11 September 2013 (UTC)
- What you're saying is of course true, and I don't remember if the previous RfC had something about limits (I'm thinking of the number of links to a certain item). The only thing that I don't think is a problem is the number of items. Google-Books has estimated that there are 130 million unique books en:Google Books. For us (13 million at the moment) that would only mean one order of magnitude more items. Adding government reports, the number might climb to 150 or 200 million items, but in a few years that will hopefully be manageable by Wikidata. In any case the community can now gather experience with sourcing and hopefully we can soon find some kind of consensus on how to handle these things. --Tobias1984 (talk) 16:31, 11 September 2013 (UTC)
- What worries me more is author (P50). For Sveriges kommunindelning 1863–1993 (Q14849547) I had to create: Per Andersson (Q14849601). But the only thing I know about him is that he has written this book and that he was born in 1961. There are so many "Per Andersson" that it's impossible to identify whether there are any other objects with this title, who have the same name and were born in 1961. I'm glad that it wasn't P. Andersson; then I would not even know P21. -- Lavallentalk(block) 16:39, 11 September 2013 (UTC)
- Searched on VIAF and found him :) Per Andersson --Tobias1984 (talk) 16:47, 11 September 2013 (UTC)
- I'm impressed! -- Lavallentalk(block) 17:10, 11 September 2013 (UTC)
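The spreadsheet-plus-bot workflow described above can be sketched roughly as follows. This is only a minimal illustration, not an existing bot: the function names and the column names `qid` and `code` are assumptions, and the sketch only prepares data without calling the API. It builds one reference (stated in (P248) plus retrieved (P813)) in the JSON snak layout that the `wbsetreference` API module accepts in its `snaks` parameter, and pairs it with every row of the sheet.

```python
import csv
import io
import json

# Calendar model URI used by Wikibase time values (proleptic Gregorian).
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def item_snak(prop, qid):
    """Snak whose value is another item, e.g. stated in (P248)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": int(qid.lstrip("Q"))},
        },
    }


def date_snak(prop, iso_date):
    """Snak holding a day-precision date, e.g. retrieved (P813)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "time",
            "value": {
                "time": f"+{iso_date}T00:00:00Z",
                "timezone": 0,
                "before": 0,
                "after": 0,
                "precision": 11,  # 11 = day
                "calendarmodel": GREGORIAN,
            },
        },
    }


def build_reference(source_qid, retrieved):
    """One reference, built once so that every statement gets identical data."""
    return {
        "P248": [item_snak("P248", source_qid)],
        "P813": [date_snak("P813", retrieved)],
    }


def plan_edits(sheet_text, source_qid, retrieved):
    """Pair each (qid, code) row of the sheet with the shared reference JSON."""
    snaks = json.dumps(build_reference(source_qid, retrieved))
    rows = csv.DictReader(io.StringIO(sheet_text))
    return [{"item": r["qid"], "code": r["code"], "snaks": snaks} for r in rows]
```

For instance, `plan_edits("qid,code\nQ4115189,0000\n", "Q14907217", "2013-09-11")` returns one planned edit for the sandbox item with a made-up code; a bot operator would still have to look up each statement's GUID and submit the edit.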
original title in the original language
How should we handle cases where the source is in two or more languages? So far, I have added two "languages" and two "titles". -- Lavallen (talk) 16:38, 13 September 2013 (UTC)
- Can you provide a link to your example ? Snipre (talk) 21:44, 25 September 2013 (UTC)
- Did you check both language editions or did you check one and assume the other was the same? You should really only link to the edition you used as a source. Filceolaire (talk) 23:00, 25 September 2013 (UTC)
- It's not two editions; it's one report written partly in Swedish, partly in English. Localities 2010 (Q14907217) is an example. The description in the report is in two languages, and the title is in two languages. More or less all official reports from SCB look like this, but the reports that are ~100 years old are written in French instead of English. If there had been two editions, they would most likely have separate ISSNs. -- Lavallen (talk) 07:34, 26 September 2013 (UTC)
Who is responsible for this change? Is there a bot which exchanges the properties? Then I would like to add Integrated Taxonomic Information System (Q82575) and World Odonata List (Q13561342) to the queue. With What links here you find about 8000 items which use P585 in ~4 source claims per item which can be moved to P813. — Felix Reimann (talk) 19:13, 25 September 2013 (UTC)
- There are two changes: see here and here. Snipre (talk) 21:26, 25 September 2013 (UTC)
- Thanks for the hint.
- The information in World Odonata List (Q13561342) must now be distributed to ~24000 claims. What if we would have a Wikipedia article for exactly this web page (World Odonata List) - not the publisher/hoster?
- What if a database defines publication date (P577)? Should retrieved (P813) be omitted then, as defined for web pages? Example: [5]. ITIS defines the modification date for each database entry. I used this as input for publication date (P577). — Felix Reimann (talk) 07:34, 26 September 2013 (UTC)
- The important information is publication date (P577), because it makes it possible to explain differences in the same web page over a certain period. When no publication date (P577) is available, retrieved (P813) can be used. If you have publication date (P577), retrieved (P813) becomes useless; adding it is not forbidden, but it is not interesting: we don't require the date on which a contributor used a book for sourcing.
- So my proposals for your questions: 1) if we start with some page, it would be difficult to manage item creation for web pages; 2) same as for web pages. But this is only my opinion. Snipre (talk) 19:54, 27 September 2013 (UTC)
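The date rule suggested in this thread (prefer publication date (P577), fall back to retrieved (P813)) could be written down as a small helper. This is a sketch of one reading of the comment above, not agreed policy; the function name and the `keep_retrieved` flag are made up for illustration.

```python
def reference_dates(publication_date=None, retrieved=None, keep_retrieved=False):
    """Choose which date snaks to put in a reference to a web page or database.

    Returns a list of (property, date) pairs:
    - publication date (P577) is used whenever it is known;
    - retrieved (P813) is used when no publication date is available;
    - with keep_retrieved=True, P813 is added next to P577, since adding it
      is described as uninteresting but not forbidden.
    """
    dates = []
    if publication_date is not None:
        dates.append(("P577", publication_date))
        if keep_retrieved and retrieved is not None:
            dates.append(("P813", retrieved))
    elif retrieved is not None:
        dates.append(("P813", retrieved))
    return dates
```

With both dates known, `reference_dates("2013-09-01", "2013-09-26")` keeps only the P577 pair; with only a retrieval date, the P813 pair is returned.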
Further reading
How do I add general literature about a person or a topic? Is there a property like "further reading"? --93.197.101.45 22:12, 1 November 2013 (UTC)
- Wikidata is not Wikipedia: at the moment Wikidata works only with facts and their sources. So if you add information about a person, you can add the corresponding sources. There are discussions about using Wikidata as a place to store literature data in support of Wikipedia, but this is not yet accepted in the notability policy. Snipre (talk) 00:12, 2 November 2013 (UTC)
- For that to happen we'd first need to support bibliographic information from Wikipedia in Wikidata, and then check which books are linked from a certain article. This has general acceptance, but the software is not ready for this yet (mainly because of Bugzilla: 47930). With those book records you should be able to use the property main subject (P921), which would help to make faceted searches. In the future it might be feasible to replace the "further reading" lists with autogenerated lists, for instance to replace w:Benjamin Franklin#Biographies with a search "main topic of creative work:'Benjamin Franklin' AND genre:'biography'"; however, we are still far from that.--Micru (talk) 13:05, 2 November 2013 (UTC)
Conference paper
We should have a recommendation for citing conference papers, such as https://www.lpi.usra.edu/meetings/lpsc2013/pdf/1576.pdf . Superm401 - Talk 02:30, 24 November 2013 (UTC)
To Source or Not To Source
I tried my best recently to better source Douglas Adams (Q42), that is one of our Showcase items, since it bore a large number of "Imported from WP" statement sources. This item bears statements pointing to no less than 19 different authority control sources; if I was to follow Help:Sources/Items not needing sources#When the item has a property referring to an external source, there would barely be any statement source at all for that item... I preferred using stated in (P248) as suggested in Help:Sources#Trusted database. Do I misinterpret Items not needing sources ? LaddΩ chat ;) 20:01, 8 December 2013 (UTC)
- The link to an authority (in this example: "Douglas Adams (Q42) VIAF ID (P214) 113230702") does not require a source but every other statement should be sourced. If not, how should one know which statements are supported by which trusted authorities? -- JakobVoss (talk) 22:31, 16 December 2013 (UTC)
Source for a Commons image
Douglas Adams (Q42) uses image from Commons. Does this require any kind of sourcing info? It's already pretty clear on Commons, after clicking on the image. LaddΩ chat ;) 20:11, 8 December 2013 (UTC)
- Three possible answers - may other editors add their opinion:
- 1) This is one of the rare cases where Wikipedia is a viable source. The source of an image statement should be a document that proves the usage of this specific image as a portrait.
- 2) Alternatively you could omit the source, so you as editor ("It's already pretty clear" to you) are the source of the link between Douglas Adams (Q42) and the image.
- 3) The image source is www.hughes-photography.eu, as stated in Commons. Commons templates should be moved to Wikidata, so the image would be an item with source information attached to it. -- JakobVoss (talk) 22:37, 16 December 2013 (UTC)
Web sources referenced many times
How can we store a web source with many properties which is used many times, like a cite web source template, such as en:Template:Ru-pop-ref? --JulesWinnfield-hu (talk) 14:28, 12 February 2014 (UTC)
- You can add several properties in one source in several places. See how I have used this web page. -- Lavallen (talk) 15:46, 12 February 2014 (UTC)
- Yes, I know. This is not the point. The point is that I need to do this (the same thing) several times. It is not productive, not desirable, and it might be inconsistent. See [6]. Should I do this for all Russian cities? --JulesWinnfield-hu (talk) 16:30, 12 February 2014 (UTC)
- I have done the above 1700 times, so I know what you mean. But I guess you can treat "Всероссийская перепись населения 2010 года. Том 1" as a report, like I have done with this Swedish census from 2010. -- Lavallen (talk) 16:39, 12 February 2014 (UTC)
- We need some kind of gadget for this. Nobody is going to source things as long as they have to fill in the same 5 fields over and over again. --Tobias1984 (talk) 19:47, 12 February 2014 (UTC)
- Yes, but I am not sure if a normal gadget would help. Using something like labellister 3000 times does not make it easier than doing it by hand. We would need something like a script that is easy to modify to my own needs. -- Lavallen (.talk) 06:28, 13 February 2014 (UTC)
- And what about using a bot to do that job? If you have a certain number of values from a source, it is better to prepare the data in a specific form and ask a bot to do the import job. Snipre (talk) 12:10, 14 February 2014 (UTC)
- I added a feature request for Magnus' AutoList tool. I think, this could be beneficial for all who do not own a bot. — Felix Reimann (talk) 12:26, 14 February 2014 (UTC)
- Nobody needs to have their own bot: just provide the data in the right format to a bot and put the request on the appropriate page. Snipre (talk) 18:40, 15 February 2014 (UTC)
- And I have added a similar request here --ValterVB (talk) 20:07, 14 February 2014 (UTC)
Articles in books
It seems that we do not have guidelines for articles that are part of a collective book. Is Human Evolution (Q15864332), used in human (Q5)'s temporal range start (P523), OK? I am not really comfortable with the "part of", but that's mostly because I am not really happy with using "part of" in source items in the first place. --Zolo (talk) 07:19, 2 March 2014 (UTC)
- I don't see any difference between articles that have been published as part of a book and articles that have been published in a journal. Help:Sources can be used for both. What is missing in your citation is page(s) (P304). The distinction between works and editions is a bit tricky. Right now we have only the 2009 edition of Evolution: The First Four Billion Years (Q15864345). But what happens if there is a work item for this book in the future? --Kolja21 (talk) 15:25, 2 March 2014 (UTC)
Mock-up of a Wikidata {{Cite}}
If someone is interested: I created a prototype Lua module for a Wikidata-based {{Cite}}. Use it with {{#invoke:Cite|cite|qid=Q14405740}} to create the citation for Q14405740. This will result in:
Please use this very sparingly and only for model testing, as it uses costly JavaScript methods to retrieve the data. As soon as arbitrary item access is enabled by the developers, we can use Wikibase client functions and the same code will execute with high performance. Then we may use it everywhere. If you want to speed this up, vote for bug 47930. Regarding the status of the module itself: feedback and help are very welcome; much is still missing. — Felix Reimann (talk) 15:23, 24 April 2014 (UTC)
- Very good idea. I like it very much. Unfortunately we have to wait until the defect 47930 is resolved. I think it would be simpler if the qid parameter is passed as first indexed parameter to avoid the qid=... and reduce the call to {{CiteQ|14405740}} or something if the #invoke is wrapped in a template. Paweł Ziemian (talk) 19:30, 24 April 2014 (UTC)
- The second wish is for {{#invoke:Cite|sfn|qid=Q14405740}}, to generate a reference in a form similar to that produced by en:Template:Sfn. This should be easy, as the Qid can be used as an automatic unique internal anchor identifier. Paweł Ziemian (talk) 19:48, 24 April 2014 (UTC)
- Thank you for your feedback! Here is your wish 1: Template:CiteQ. — Felix Reimann (talk) 20:23, 24 April 2014 (UTC)
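To make the idea of a Wikidata-driven citation more concrete, here is a rough Python sketch of how a citation string could be assembled from an item's statements. It is not the Lua module's code: the function name, the flat record layout and the punctuation style are assumptions, and the values are taken as already resolved to plain strings.

```python
def format_citation(record):
    """Render a one-line citation from values keyed by property ID.

    `record` is a plain dict standing in for an item's statements.
    Missing fields are simply skipped.
    """
    parts = []
    authors = record.get("P50", [])        # author (P50), list of names
    if authors:
        parts.append(", ".join(authors))
    if record.get("title"):                # title, as a plain string
        parts.append(record["title"])
    if record.get("P123"):                 # publisher (P123)
        parts.append(record["P123"])
    if record.get("P577"):                 # publication date (P577)
        parts.append(str(record["P577"]))
    citation = ". ".join(parts)
    if not citation:
        return ""
    if record.get("P304"):                 # page(s) (P304)
        citation += f", p. {record['P304']}"
    return citation + "."
```

Called with a placeholder record such as `{"P50": ["A. Author"], "title": "An Example Title", "P123": "Example Press", "P577": 2009, "P304": "12"}`, it returns `A. Author. An Example Title. Example Press. 2009, p. 12.`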
Updates as part of documentation overhaul
Hi all,
I recently made substantial edits to Help:Sources as part of a larger sitewide documentation overhaul (more info on this here).
To compare my edits with the previous version please see the diffs here
Major changes include the following:
- provided more introductory content on what sources do/are for
- added examples of when sources are not required; also added more examples to other methods/types of sourcing (now each type has its own example)
- added new types of sources (policy, legislation)
- integrated examples of sources with step-by-step instructions on adding them
- explained more about how to determine whether a database property exists in Wikidata already or not
- explained more about the two ways of handling web pages as sources (i.e. when a webpage item exists and when one doesn't)
- minimized/reduced discussion of future/anticipated processes or features (e.g. monolingual data type property)
- removed some of the 'optional' qualifiers suggested for adding work items
- moved around content in an attempt to redesign page so it more closely matches the format of other Help pages
- added a screenshot
Please let me know if you have any concerns about these changes or suggestions on further improving the documentation. If you see something wrong (like a typo) or something that could be easily improved upon (like an example, confusing wording, or a label for a screenshot), please just go ahead and fix it - no need to comment here first!
Here are issues I would also like specific feedback on:
- is the distinction between "sources" and "references" useful? should we just eliminate it and use only one of the terms?
- do the web pages examples make sense and are they accurate? Specifically, is the example of United States Population Clock (Q3398022) correct in terms of how we should treat web pages that already have items in Wikidata?
- are there too few screenshots? What else should be in there?
- should there be more info on adding/creating properties for databases or general information on this process?
Thanks. -Thepwnco (talk) 17:24, 28 June 2014 (UTC)
- @Thepwnco: WRONG: you never use a book item in the reference section but only the edition of a book. A book item is used only in the book's edition item.
- source section -> book's edition -> book.
- That's why the term work is better than book. The book is more the hard copy, and what we need is an item for the work. The only question is: do we need two items for a "book" with only one edition? We have to deal with that, but definitely don't use the "book" item in any reference/source section. Snipre (talk) 10:55, 30 June 2014 (UTC)
- @Snipre:, I am also in favor of dropping "book" and using "work" or "creative work". We should consider items that represent both an edition and the work. It is already happening, so better to tag them appropriately. In any case it is something that should be discussed more broadly.--Micru (talk) 12:45, 30 June 2014 (UTC)
- @Snipre, Micru: Hi both, thanks for your comments and for bringing attention to my mistake. I've since reverted mentions of 'book item' back to 'work item' and expanded the section to include a bit more info on the distinction between work items and edition items - my apologies for not getting this right the first time around. Please let me know if there's anything still troubling or problematic about this section. Cheers. -Thepwnco (talk) 00:31, 1 July 2014 (UTC)
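The chain "source section -> book's edition -> book" could be checked by a script along the following lines. This is a sketch under stated assumptions: items are plain dicts keyed by property ID, the example identifiers are placeholders rather than real items, and the use of Q3331189 as the edition class and of edition or translation of (P629) as the link from edition to work is assumed here, not taken from this discussion.

```python
EDITION_CLASS = "Q3331189"  # assumed class used for edition items

# Placeholder items, not real Wikidata entities.
EXAMPLE_ITEMS = {
    "edition-1": {"P31": [EDITION_CLASS], "P629": ["work-1"]},
    "work-1": {"P31": ["Q386724"]},  # work (Q386724)
}


def resolve_source(items, qid):
    """Return (edition, work) for an item cited in a reference.

    Raises ValueError when the cited item is not an edition, following the
    rule that only edition items belong in the source section.
    """
    item = items[qid]
    if EDITION_CLASS not in item.get("P31", []):
        raise ValueError(f"{qid} is not an edition item; cite an edition instead")
    works = item.get("P629", [])
    # An edition without a linked work is allowed here; the work is then None.
    return qid, (works[0] if works else None)
```

`resolve_source(EXAMPLE_ITEMS, "edition-1")` yields the edition together with its work, while passing `"work-1"` raises an error, which is the situation the comment above warns against.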
- The first look at the page simply made me turn away. It is probably the best example of all help pages on how to turn down beginners. It already starts in the introduction: "Within a reference statement, however, the data value is always a source" - pardon? In general, the amount of content is just too overwhelming. I think the step-by-step procedures make sense since these try to ensure consistency, but the page should be more dynamic, hiding most of the content initially. Maybe, the page should behave more like a questionnaire: The user selects the type of source he intends to add and is presented the necessary steps. I would not move the content to individual pages but have some mechanism to expand/collapse the individual steps. A question that struck my mind (partially off the actual content): Imagining queries... How would one know if something has no reference because it is regarded common knowledge? In my opinion, there needs to be a flagging mechanism marking statements common knowledge - but that is something technical. Random knowledge donator (talk) 09:18, 2 July 2014 (UTC)
- @Thepwnco: We need a section to indicate how to create an item for author. But this is not specific to the source definition but for any person. Do you plan to define a section for that or to link to a specific page ? Snipre (talk) 15:56, 2 July 2014 (UTC)
- @Snipre: Thanks for the suggestion. Personally I don't think that creating an item for an author (or any person, generally speaking) is complicated enough to merit its own documentation or section on this page. Given other comments here that suggest the Sources page already has enough content as it is, I'm also reluctant to add even more about secondary items of interest. Could you please explain what essential information would or should be included about creating an author item? Would just a link to how to create an item and mention of the need to create items for authors be enough? Thanks. -Thepwnco (talk) 19:08, 2 July 2014 (UTC)
The usage of described by source (P1343) is unclear. Imho this property can be used for references like:
- Vladimir K. Zworykin (Q296545) → Brockhaus Enzyklopädie (19 ed.) (Q17377889) with qualifiers P357 (P357): "Zworykin, Wladimir Kosma", volume (P478): "24" and page(s) (P304): "672"
I left a note at the talk page: Property talk:P1343#Usage. --Kolja21 (talk) 10:19, 20 July 2014 (UTC)
- I agree, but it should be section, verse, paragraph, or clause (P958) instead of P357 (P357), since the latter is for titles of a whole work. I have added some aliases to P958 to make it easier to find when typing "title" or "entry". The problem is that sometimes P1343 is also used with online encyclopaedias, which might be problematic, especially considering that we already have Reference URL.--Micru (talk) 10:31, 20 July 2014 (UTC)
Done I've corrected the example given in Property talk:P1343. Thanx for your feedback. --Kolja21 (talk) 16:41, 24 July 2014 (UTC)
Error in "Scientific, newspaper or magazine article" section?
AFAIK DOI (P356) and volume (P478) should be in the article item, not the publication item. Is that correct? [15:10, 6. Aug. 2014 Sbisolo]
- I think it depends. Articles for sure have volume (P478); the publication item can have P478 or not. A journal may be published for over 100 years, but a commemorative publication (Q933348) has for sure a clearly defined year of publication. Concerning P356, I'm not sure how it is used. But if there is an identifier you can connect to an item without any doubt, it is OK to use it.--Giftzwerg 88 (talk) 18:19, 7 August 2014 (UTC)
- If an article surely has volume (P478), why not add this property to the article item? DOI (P356) can identify a journal, an individual issue of a journal, an individual article in the journal, or a single table in that article. This property can be added to both the article and the publication (when available). --Sbisolo (talk) 09:06, 11 August 2014 (UTC)
Wikidata:WikiProject Source MetaData
Interested in Sources and the future of Wikidata? See WikiProject Source MetaData - Mattsenate (talk) 13:39, 8 August 2014 (UTC)
Section about edition item
The corresponding section reads like this: When sourcing statements, you should only ever use the edition item of a book. In some cases, it will be necessary to first add the work item to Wikidata in order to create an edition item. You can do so by following the steps below:
Check if the work item already exists in Wikidata. If an item is found, proceed to step 4. If the item is not found, create a new one for it and go to step 2. Add at least the following statements to the work item:
- instance of (P31) → book (Q571)
- author (P50)
- P357 (P357) and P392 (P392) with original language of film or TV show (P364)
- there also should be a property for the "year of origin" or "year of first publication", or should we use inception (P571) - this, to me, would rather be the date the author finished the work, even if never published.
- perhaps the indication for PD should be applied here, as it is applied to the work, not the edition :) - but perhaps it is a little too early to speak about legal matters ;) --Hsarrazin (talk) 12:54, 19 August 2014 (UTC)
As I see it, work items should not use book (Q571), but work (Q386724) instead. Prince Hamlet is a creative work, not a book (in fact it is a play, published in a book). Book is a suitable value for Book of Kells (Q204221), an item about a physical book you can touch. --Giftzwerg 88 (talk) 14:00, 17 August 2014 (UTC)
- @Giftzwerg 88: That is true, and I would like to make the change. Pinging @Kolja21: to see if he has any concern left after the last discussions that happened on this talk.--Micru (talk) 14:53, 17 August 2014 (UTC)
- work (Q386724) has a colorful variety of labels like "creative work" (en), "artistic work" (de) or "GND-C" (no idea what that means). These definitions will exclude some books and include many other things like paintings and compositions. Of course we can add "creative/artistic", or "work" in general, to every book item. But please keep in mind that we are talking about Help:Sources. So the main question will remain: What property should be used for books? Or should we give up books and only keep journals? A general property will undermine maintenance work. --Kolja21 (talk) 23:36, 17 August 2014 (UTC)
- However, we must keep work items separate from edition items. Anyone can clearly understand that some works may exist as books but also as electronic files or as voice recordings (as speeches). Some works of ancient authors don't even exist any longer as a physical copy, and we just have reports about the work or some scattered quotations, as with Hortensius (Q1630211). So Hortensius is a work, not a book. According to Functional Requirements for Bibliographic Records (Q16388) we must distinguish between creative works, editions and a special copy of an edition. Sometimes a book includes several works in one volume. Some works like encyclopedia (Q5292) come in several volumes. In rare cases, however, some works exist only in one copy, and therefore the item for work, edition and copy can be the same; in this case use instance of (P31) a book, instance of (P31) a scroll, or whatever kind of physical remains exist. If you are not happy with work (Q386724), we just have to find something that captures the point more clearly.--Giftzwerg 88 (talk) 12:14, 18 August 2014 (UTC)
- @Kolja21, Giftzwerg 88: It took me a while, but I have separated work (Q386724) from creative work (Q17537576), and I have been cleaning the work tree. Now we also have a distinct manifestation tree. I hope it helps to make things clearer. There are still many tasks left to do, because sometimes we have items representing both a genre and a type of work, when there should be two separate items. Some other times a "combines" property would be useful; for instance, a dance work combines a musical work and a performance work.
- Perhaps the most confusing is that expression and manifestation are very closely related, but in general it is more like a gradient than as 4 clear-cut levels. Thanks to this model I have very clear now that on one extreme one can imagine the pure physical world, on the other pure information world, and in between one can describe how information takes shape as matter and how matter takes shape as information.
- The "group of entities" tree is also interesting. But I am missing some properties, like the proposed "has quality" property to say "<group of entities> has quality <group>".--Micru (talk) 14:45, 18 August 2014 (UTC)
- @Micru: Thanks for the clarification. I know it's a difficult case. Choose for "books" whatever you think will fit best and we give it a try. --Kolja21 (talk) 22:09, 19 August 2014 (UTC)
Proposed update works-editions
A reminder of w:FRBR levels with our correspondence:
- work (work item): a distinct intellectual or creative work
- expression (edition item): a version of that work
- manifestation (edition item): the group of physical objects that carry the information considered the same version
- item (commons file): one exemplar
Proposed update for sourcing information.
- For work items: instance of any subitem of the work tree. For literature that might be literary work (Q7725634), for science it might be academic writing (Q17538129).
- Indeed, but I must say, I have some difficulty between work (Q386724) : distinct intellectual or artistic creation and creative work (Q17537576) : distinct artistic creation such as artwork, literature, music, and paintings :(-- Hsarrazin (talk) 12:42, 19 August 2014 (UTC)
- Again, there is no clear boundary as for when a work is "creative" or not, but there are some works that do require a high degree of creativity, like in all artistic fields. Is a "scientific paper" creative? Maybe, but it is mainly meant to be methodical. Or is a list of names considered creative? I hope not :)--Micru (talk) 13:32, 19 August 2014 (UTC)
- The question which actual work item to use is secondary. In doubt, I'd just use work (Q386724) and let other people refine. -- JakobVoss (talk) 12:26, 20 August 2014 (UTC)
- For edition items: use "instance of:edition" (meaning "work version", represents the expression part) and optionally any subitem of the manifestation tree to represent the manifestation part.
In some cases it is necessary to separate the manifestation from the expression, for that we will need a new property (see property proposal: manifestation of). Note that all items are approximations and there is no perfect way of doing it, just a general way of getting closer to reality. By having a "roadmap", we can make an item develop in any necessary way, sometimes putting more emphasis modelling the work level, other times it might be more relevant to model thoroughly the manifestation level.
When there are aggregations/groups/collections of works, use any subitem of group of works. For aggregations/groups/collections of manifestations use any subitem of group of manifestation. Note that language is polysemous and that their meaning might be different depending on the context (for instance using "game" to refer to a "game work", a "game expression" or an instance of "game event"), but in our case we need to untangle that ambiguity, or at least offer a possible path to untangle it whenever necessary or relevant.
Here there is a table with a possible distribution of the properties. This is not a fixed structure, some properties might start in one item and be moved to the next as necessary (ISBN might be more suitable for the manifestation when such item exists). And sometimes there is no need to make a distinction, so the item can be modeled at just the expression-manifestation level (e.g. papers with no different versions), or at work level (wikipedia articles).
- you seem to use manifestation for the description of scans of books, but those would just be items, each scan being from a different book (from different libraries), even if it's the same edition... is that right or is there something I don't understand ?
- I changed it slightly to make more clear that "manifestation" is nothing material, just the "materialization", or said differently "how something abstract can be made some real object", the real object as you say is the "item" (a scan, a file, a book on your desk, etc).--Micru (talk) 13:32, 19 August 2014 (UTC)
In a way, it is like cutting a river with a knife: there are no exact boundaries about how and where to cut. Anyhow, does this update sound reasonable? Any comments?--Micru (talk) 08:26, 19 August 2014 (UTC)
- Thanks for that very important work that will, I hope, help understanding of what we try to do here :) --Hsarrazin (talk) 12:42, 19 August 2014 (UTC)
- The good thing is that these concepts can be used in every field without much effort (I hope) :)--Micru (talk) 13:32, 19 August 2014 (UTC)
- Please don't stick to the rather theoretical FRBR terminology, it has been discussed with much confusion and little practical outcome since 1998. In particular, "expression" and "manifestation" are not applicable without pain. Let's start with works as suggested above. Some works are unique (e.g. Mona Lisa (Q12418)) and some are available only in different forms, versions, variants, and editions (e.g. particular editions of a book). Implementation of edition is discussed at Wikidata:Requests_for_comment/References_and_sources#How_to_store_edition_data. -- JakobVoss (talk) 12:26, 20 August 2014 (UTC)
- @JakobVoss: I am aware of the difficulties when modelling items, as explained in the essay linked below. The only change is that before we were supposed to use "book" for everything (which is a mix of matter and information), and now it is "work" (purely information, abstract) or any sub-item, which makes more sense and it has application for more fields without constraint. As you say there are many ways to develop an item, it is just a matter of providing a clear path without too many rules or terminology.--Micru (talk) 13:17, 20 August 2014 (UTC)
- @Micru, JakobVoss: Next construction site, after a decision is made about the work level, is the property "distribution". Since it was originally made for video games and software, its usage for (printed) books, e-books, and audiobooks has to be specified, see Property talk:P437#Usage note. --Kolja21 (talk) 17:38, 22 August 2014 (UTC)
- @Kolja21: "Distribution" and "manifestation" are very tied together. I see the property "distribution" as a way of merging both what FRBR calls "expression" and "manifestation" into what we call "edition". I see no problem with using it for all domains, but perhaps a more specific label (or alias) would help, something like "physical format".--Micru (talk) 07:58, 28 August 2014 (UTC)
- @Micru: Your proposition of changing the use of book is a good idea but your proposition of using a tree which is changing every day isn't in favour of the first proposition. If we want to go to a more accurate classification ok but we need to have a description of this new classification covering the same domain of the first use of "book". Can creative work be more clear than book ? I prefer a not perfect classification but simple than a more accurate but not defined classification. We need to fix thing, with a possible development, but by changing now we have to know the new classification scheme. Snipre (talk) 21:58, 27 August 2014 (UTC)
- @Snipre: By using a tree we give the chance to get closer to whatever users enter as "instance of" and still give results no matter if they look for "literary works", "poems" or "creative works". If we use only "creative work" or "work", then there is the need to add a further property for the type of work, which coincidentally is always a subproperty of "work". That the tree of works changes all the time is a good thing, what exactly are you afraid of?--Micru (talk) 07:58, 28 August 2014 (UTC)
- While this scheme seems logical, it does not help much when linking pages from different languages on Wikisource. Currently, we link translations with the original work, and that is useful. How do you intend to do that with WikiData? Regards, Yann (talk) 09:24, 27 August 2014 (UTC)
- @Yann: You can use either "based on" or "follows". Their meanings are different when used on a work, than when used on an edition.--Micru (talk) 07:58, 28 August 2014 (UTC)
- @Micru: Do you have an example? Regards, Yann (talk) 18:21, 29 August 2014 (UTC)
Examples!
Please don't discuss, argue and propose without concrete examples. For instance Qur’an (Q428) is a work (Q386724) (via several subclasses) and The Koran Interpreted (Q7744922) is a translation of Qur’an (Q428) -- JakobVoss (talk) 12:49, 20 August 2014 (UTC)
- @JakobVoss: Done. You also have a nice example here: Diary of Anne Frank (Q6911). I also have written an essay which aims to bring understanding to the problem of modeling reality: Growing items. I hope it helps! --Micru (talk) 13:09, 20 August 2014 (UTC)
New template for citing sources
I have created Template:Cite item. It works like this: {{Cite item|item number}} (or {{Cite item|item number|page=page number|lang=lang used for formatting}}). It is not complete yet, but it might already be useful. Note that it only works if the data are present in the item that is passed to the template. For example, if the item is about an edition, but the author's name is only in the item about the work, not the edition, the module does not follow the link provided in edition or translation of (P629). If this is needed, it would be relatively easy to implement. However, it may be good practice to provide all data in the item about the work, as it would also make the data easier to use by external users.
Note that Module:Cite/sandbox is also used by Module:Wikidata. When the 'showsource' parameter is set to true, the source provided in stated in (P248) is shown. There are examples in the documentation page. --Zolo (talk) 10:38, 29 August 2014 (UTC)
- Yes, looks like there is still much work to do. "Pages" is translated with "Schutzstaffel" (SS) and I have no idea why the example Jamot and Wildenstein, Manet, catalogue critique, 1st edition (Q15619449) produces script errors. BTW: I've seen you've started a week ago Template:Cite book. Do you want to use it too? --Kolja21 (talk) 02:58, 30 August 2014 (UTC)
- Fixed, I was pretty sure I had seen "SS" as a plural for Seiten in German bibliographies :\.
- I was originally planning to add an "item" option to {{Cite book}} but then I realized that, for the beginning at least, it was easier to start with something new, and also that the book/article/.. distinction should be done automatically by the software. Cite book should be usable as a standard Template:Cite book (Q92570) though.--Zolo (talk) 07:52, 30 August 2014 (UTC)
source - imported from xxwiki
I do agree with the text at "Different types of sources" about "sourcing" from wikis, but in some cases of wiki-related statements it is acceptable and even desirable. Typical example is Property:P31 Q4167410 or Q4167836. Could I (or someone else) modify the text to reflect this? --Jklamo (talk) 12:43, 23 December 2014 (UTC)
- Jklamo No, you can use the URL property to link to the page describing this kind of syntax in WP:en for example. Snipre (talk) 14:43, 25 December 2014 (UTC)
Sourcing from books in Wikidata
Here's a notion:
- Currently, it's frustrating to try to add a citation to a book to a Wikidata item, because often not only are the work (Q386724) and version, edition or translation (Q3331189) not yet created, but often we also need to add the person items for the author/editor(s) and the organization item for the publisher.
- A wizard to add references is on the Roadmap, but is some way away.
Proposed short-term solution:
- Add a new property reference ISBN with qualifiers page number start and page number end.
- Reference ISBN would be an external link (similar to https://en.wikipedia.org/wiki/Special:BookSources/0870998595)
- A bot could check all the reference ISBNs for a matching edition in Wikidata, and if one exists, replace the "reference ISBN" link with a "stated in [item]" link.
This would allow users to add references to statements quickly without interrupting their workflow, and over time, as more books are added, these links would be self-correcting via the bot. It would also supply an immediately verifiable source via the external link.
Would this work? - PKM (talk) 20:30, 16 January 2015 (UTC)
- PKM Already requested, see here. But this needs a programmer, and we need the authorization of the database to import data using bots. Snipre (talk) 21:03, 16 January 2015 (UTC)
- Please note that an ISBN can be used for multiple editions. An ISBN without the number of edition is ambiguous. As a first step we should import the books linked to en:Template:BibISBN. This would establish a basic library catalog in WD. --Kolja21 (talk) 23:22, 16 January 2015 (UTC)
Great idea, but related question Scott_WUaS (talk) 16:45, 17 January 2015 (UTC): Scott_WUaS: in anticipating books' ISBNs written in all 7,870 languages (e.g. in Glottolog - https://glottolog.org/glottolog/language ), what might Wikidata also need to plan for?
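The bot step proposed above, combined with Kolja21's caveat that one ISBN can cover several editions, could be sketched roughly as follows. This is only an illustration under stated assumptions: "reference ISBN" is the proposed property and has no P-number, the item IDs are placeholders, and the ISBN-to-edition index is assumed to be built elsewhere with normalized keys. A real bot would work on Wikibase reference snaks rather than plain dictionaries.

```python
from typing import Dict, List

# Hypothetical key: "reference ISBN" was only proposed above and has no P-number.
REFERENCE_ISBN = "reference ISBN"
STATED_IN = "P248"  # stated in


def normalize_isbn(isbn: str) -> str:
    """Drop hyphens and spaces so differently formatted ISBNs compare equal."""
    return isbn.replace("-", "").replace(" ", "").upper()


def resolve_reference_isbn(
    reference: Dict[str, str],
    editions_by_isbn: Dict[str, List[str]],
) -> Dict[str, str]:
    """Replace a 'reference ISBN' entry with 'stated in' when exactly one edition matches.

    `editions_by_isbn` maps a normalized ISBN to the edition items carrying it.
    If no edition or more than one edition matches, the reference is returned
    unchanged, because an ISBN shared by several editions is ambiguous.
    """
    isbn = reference.get(REFERENCE_ISBN)
    if isbn is None:
        return dict(reference)

    matches = editions_by_isbn.get(normalize_isbn(isbn), [])
    if len(matches) != 1:
        return dict(reference)

    resolved = {k: v for k, v in reference.items() if k != REFERENCE_ISBN}
    resolved[STATED_IN] = matches[0]
    return resolved


# Example with placeholder item IDs (not real Wikidata items).
example_reference = {REFERENCE_ISBN: "0-87099-859-5", "page number start": "12"}
example_index = {"0870998595": ["Q_EDITION_A"]}
print(resolve_reference_isbn(example_reference, example_index))
```

Leaving ambiguous or unmatched ISBNs untouched keeps the external BookSources link as the fallback, so nothing is lost while the edition items are still missing.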
References
Hello. When I get an article in Google Academics and want to use it as reference, how can I do it? I tried "site", "article", "pdf", but none of these worked.--MisterSanderson (talk) 22:09, 2 April 2015 (UTC)
- You have to use the existing properties not the properties you want. There is a list of properties specific to articles under Help:Sources#Scientific, newspaper or magazine article. Snipre (talk) 23:01, 2 April 2015 (UTC)
- So I need to create an article on Wikipedia for each pdf? This is absurd. They aren't this notorious.--MisterSanderson (talk) 23:58, 2 April 2015 (UTC)
- No, to create an item in WD you don't need to have an article in WP. See Wikidata:Notability: "3. It fulfills some structural need, for example: it is needed to make statements made in other items more useful." References help to increase the data quality. Just be careful to use high-quality articles from peer-reviewed journals and, if possible, articles which can be used to source other statements. Snipre (talk) 09:41, 3 April 2015 (UTC)
- I don't know how to evaluate the quality of an article, I just take them from university sites using Google Scholar. But, so I need to create items for each article that I use as reference? It's not possible to just link the external address?--MisterSanderson (talk) 13:16, 3 April 2015 (UTC)
- If you don't provide the full description of the reference, how do you want to display the source in WP when your value is used? Also, URLs change quite easily, so yes, you can source with only a URL, but this addition will be useless from a display point of view. Snipre (talk) 16:28, 3 April 2015 (UTC)
- @MisterSanderson -- I suggest using Zotero; if you have a DOI or ISSN, or a machine-readable PDF, or the original webpage (if it's decently designed, like an institutional repository, a publisher's webpage, etc.), Zotero will automatically generate a reference, and export it formatted as a Wikipedia Citation Template. This is being integrated into VisualEditor by User:Mvolz. Automated export to Wikidata should also be possible -- anyone know about this? I wrote Wikipedia:Bot requests#DOI bot. HLHJ (talk) 12:29, 8 July 2015 (UTC)
- @User:HLHJ You can't use citation templates to create references on wikidata, sadly. It all has to be done manually by creating the item in wikidata. Mvolz (talk) 18:35, 12 July 2015 (UTC)
Too much work to add a simple single reference. I give up, don't want to add them anymore.--MisterSanderson (talk) 01:36, 12 July 2015 (UTC)
- User:MisterSanderson It's acceptable to just add the link to the google scholar page instead of creating an item for the journal article. The property you are looking for when adding the references is 'reference url', and then just add the link to the website where the article is. I agree that creating an entire item just to use as a reference is a lot of work! Mvolz (talk) 18:35, 12 July 2015 (UTC)
- Requesting a bot. [7] HLHJ (talk) 14:41, 16 July 2015 (UTC)
Scientific articles : labels (probably) should be author-date
Help:Label states that “the label is the most common name that the item would be known by.” But scientists never refer to articles by their titles, they do it primarily by their authors and publication date, in combination with the name of the journal where the article is published and/or the themes of the contents. This is true even for exceptionally public-oriented pieces such as The age of the Earth in the twentieth century: a problem (mostly) solved (Q15545344). Scientific article titles are not intended to be used for reference.
Plainly speaking, the current practice is not conventional, and arguably not functional.
Tinm (talk) 21:48, 29 August 2015 (UTC)
- @Tinm: The name of the author and the publication date are not enough to distinguish between two articles: some authors publish more than one article per year. So your proposal is not coherent, because this is not a unique way to identify an article. If you include the name of the journal, the problem remains: there are cases where an article is split into two parts and published in two different issues of a journal: same authors, same year, same journal, but several articles. So again this is not sufficient to distinguish two articles.
- You also confuse the reference in the text of an article with the reference in its reference section. The reference section requires more than the names of the authors, the publication date and the journal. You have to add pages, issue and/or volume number of the journal, and most of the time you add the title too. What is the unique identifier among all these data? The title. By comparing titles you can get a pretty high degree of correspondence between the element and an article. Author, publication date, journal and issue give you 100% correspondence, but you need four pieces of information. The title gives 95-99% with only one. And as people are lazy, they choose the fastest way, so the title is a good compromise, and in case of two articles having the same title you can tell them apart by adding a second piece of information. Snipre (talk) 12:37, 30 August 2015 (UTC)
- Hi Snipre. You don't answer my main point. “the label is the most common name that the item would be known by” and it is not intended to be unique anyway. It is not made to be an identifier. It is made to be the only information (with the true identifier, e.g. Q15545344) that will be displayed on other pages, and it is with regard to that use that using titles is dysfunctional. Tinm (talk) 14:51, 30 August 2015 (UTC)
Authority control databases as sources—how to?!
On Help:Sources#Databases. According to this section, a source to a database that is already linked by an Authority control property (subsequently called <Database property>) shall be generated using stated in (P248) and additional qualifiers <Database property>, title (P1476), language of work or name (P407), and publication date (P577)/retrieved (P813).
It seems to me that we generate redundancies here: the value of the qualifier <Database property> is already available by the source-independent statement of the item, and (typically) language of work or name (P407) is a characteristic of the database rather than an item- or source-specific thing. Is it somehow possible to avoid these redundancies by direct in-source usage of <Database property> from the statement in the item itself? language of work or name (P407) and stated in (P248) can in principle be derived from <Database property>.
Reasons why this seems to be a good idea:
- if for whatever reason the identifier of the <Database property> changed in future (hopefully not that often, but who knows…), we would only have to change one easily accessible statement;
- if <Database property> is used as a qualifier in a statement, it does not create a conveniently clickable link in the Wikidata frontend at the moment; the sourced information is therefore not directly accessible, one has to navigate to the <Database property> of the item first;
- in case of many statements of one item that all can be sourced with the same database link, this would considerably reduce the data overhead.
- … (even more?)
So is there any way how we could optimize our sources in these situations? Re-usage of Authority control properties with a parameter (or qualifier?) “access date” should just do the job here.
Comments are welcome. —MisterSynergy (talk) 18:37, 29 September 2015 (UTC)
- We can delete the requirement of the language of the database because we can retrieve the language from the title of the dataset, which requires a language.
- But for the rest you assume a particular case and not the general case. Please consider the following example: a CAS number for sodium salt form of a chemical (NaX) is provided in the dataset of the acid form (HA) of the chemical in the PubChem database. And no specific dataset for NaX exists in the PubChem database so no PubChem ID exists for the NaX form. Snipre (talk) 13:16, 2 March 2016 (UTC)
- @Snipre: Finally an answer—thank you very much! :-) I partially understand your concerns, but they are somewhat abstract without a specific example. Could you please provide links to an item/CAS entry where this is the case? Detailed explanations are not necessary (for me), I will figure that out by myself. Thanks and regards, MisterSynergy (talk) 13:45, 2 March 2016 (UTC)
- Take example of 4-butanol on chemspider. There are two CAS numbers for this entry:
- * 78-92-2: for normal mixture of 4-butanol
- * 4712-39-4: for mixture of 4-butanol with deuterium instead of hydrogen
- If we have an item in WD for the second mixture (we don't have it currently, but we have deuterated molecules, see deuterated ethanol (Q1101193)), we should be able to point to the correct entry by providing the ID of the Chemspider database.
- In summary, we should have a way to link to an entry which mixes different subjects and collects data about all of them, even if each of these subjects has a separate item in WD.
- Your assumption is a similar granularity between WD and other databases, but this is not always the case. This is not unusual: just think about the Bonnie and Clyde problem. WD's solution is to create as many items as necessary to express the possible combinations. But this is not the case for all other databases. Just providing the name of the database is like providing only the title of a book: the mention of the page is very helpful to find the correct passage of the citation. Snipre (talk) 19:42, 2 March 2016 (UTC)
Circularity of database sourcing
w:en:User:Gilliam schoolblocked w:en:user:195.195.152.11, a serial vandal, back in 2014. I just discovered that someone at that IP changed the birthdate at w:en:Carola Dunn way back in 2005. Looking at their contribs, it appears that changing birthdates on biographies was a favourite pastime back then. I fear that other BLPs may still be damaged. Worse, it appears that the data has propagated out through Q5044629 Wikidata to corrupt ISNI:0000000073737522 and VIAF:35242654. I suppose it is just possible that LCAuth:n82220682 is wrong, or the source that it used was wrong, but I've adopted and cited its date anyway, in preference to the IP's edit. I hate to say it, but someone's going to have to go through all the IP's contribs and check each of the affected articles. I've raised that point at w:en:Wikipedia:BLP/N for correction.
The larger question of the interplay between databases is what I'm more concerned about. When external databases such as ISNI start to use uncited data on WP or WKD as if it were more reliable than the LCAuth record, we now have a circular referencing problem that can only be corrected by some ugly manual interventions. We should not be using WP as a reference for WKD statements. There must be a way for external database admins such as those at VIAF and ISNI to discern the actual reference behind statements and distinguish those which are not referenced at all. Data provenance matters. LeadSongDog (talk) 16:56, 5 April 2016 (UTC)
- @LeadSongDog: I strongly agree with you that "we should not be using WP as a reference for WKD statements". Representing sources as items should allow us to perform batch operations or prioritize the review of statements with dubious sourcing.--DarTar (talk) 16:23, 9 April 2016 (UTC)
Self-references and an item-datatype missing?
When @Väsk: worked with some film items here, (s)he intended to show that a statement is sourced by closing credits (Q1553078). That looks fine to me as a source model. But how do we technically do that? I would prefer to see "chapter:closing credits (Q1553078)", but the chapter property does not support items, only text. Väsk used applies to part, aspect, or form (P518), but that looks wrong to me. Should we propose a new source property or is there something I am missing? -- Innocent bystander (talk) 06:15, 2 January 2016 (UTC)
- That's an interesting case. If we can use an item to self-reference its own claims, we can use "stated in" with the item number containing the claim. This can be sufficient: for Harvard citation for example, no need to specify that a name of an actor is displayed at the end in the credits (see here). Snipre (talk) 20:08, 2 March 2016 (UTC)
Offline database
In Anette Abrahamsson (Q3616645) I have used an offline database as a source. This page does not give much guidance for such sources! -- Innocent bystander (talk) 05:40, 9 April 2016 (UTC)
- Innocent bystander You have already two examples of databases citation. I think it is possible to use the corresponding properties for your case: stated in (P248), title (P1476) and retrieved (P813). Snipre (talk) 21:14, 9 April 2016 (UTC)
- Sveriges befolkning 1970 (Q23764969) has title, publisher and publication date, the normally required stuff. But how do I tell where in the database? A database can have identifiers, even if it is not online and can be linked. I tried chapter (P792) but it does not look right. The same kind of problem can probably be identified when an offline encyclopedia (Q5292) is used as a source. The main source in the sv-article about Abrahamsson is exactly that, a CD-encyclopedia from 2000. I have no access to that CD, but I got help on svwiki to find some of the information in Sveriges befolkning 1970 (Q23764969), a database containing the civil registry in Sweden from 1971/72. -- Innocent bystander (talk) 06:19, 10 April 2016 (UTC)
- Innocent bystander If "A database can have identifiers", do as usual: create the corresponding property for the ID of the database. No need of online database to create a property. Snipre (talk) 07:40, 11 April 2016 (UTC)
- I think you didn't understand the correct process:
- * Use stated in (P248) for the database item
- * Use title (P1476) for the name of the article in the database. In your example you should use this property instead of chapter (P792) to source the date of birth.
- * Use retrieved (P813) or publication date (P577)
- * Use the ID property if it exists (for online or offline database)
- I corrected the reference section for your example. Snipre (talk) 07:51, 11 April 2016 (UTC)
- @Snipre: The title is here for the title of the database, isn't it? That statement is in the P248-item. And that item also has a publication date, therefore a retrieved date is redundant. A "retrieved date" is more or less always redundant when the data is stable, as it is in an article, book or, like here, on a CD/DVD. If it had been an online database, where the data can be changed any day, a "retrieved date" would be essential. Take a look at the first note in sv:Anette Abrahamsson#Noter. It now has two "title" entries, one unlinked and one linked. And it has both "publication date" and "retrieved date".
- "Chapter" was not a good choice, if I would have thought that, I would not have come here.
- Maybe I should propose a new ID-property for offline databases? -- Innocent bystander (talk) 08:26, 11 April 2016 (UTC)
- @Innocent bystander: No. "title (P1476) → the title of the dataset in the database". The title of the database is provided by the label of the database item.
- publication date (P577) is the first information to look for but this is sometimes hard to find for online database so if you don't have any publication date of the database or for the dataset, use retrieved (P813). You don't need to use both but at least one of these two properties to provide time information.
- No general property for database ID: if you want to create a ID property for your offline database, create a specific one for that database. That's what people want at the end so better create specific properties now instead of a general one and to see later a mix of general and specific properties. Snipre (talk) 11:20, 11 April 2016 (UTC)
- Having title P1476 both in the database-item and the reference-part of the claim looks terribly difficult to decrypt for a multipurpose module!
- Are you aware of how many databases there are, only about the Swedish civil registry? -- Innocent bystander (talk) 16:56, 11 April 2016 (UTC)
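As a rough illustration of the process Snipre lists above (stated in for the database item, title for the dataset, at least one of publication date or retrieved, plus a database-specific ID property when one exists), here is a minimal sketch. The function name, the plain-dictionary layout and the example dataset title are assumptions made for illustration; they are not how Wikibase stores references, and nothing here settles the disagreement about whether a second title is desirable.

```python
from typing import Dict, Optional

STATED_IN = "P248"         # stated in: the database item
TITLE = "P1476"            # title: name of the dataset/entry inside the database
PUBLICATION_DATE = "P577"  # publication date
RETRIEVED = "P813"         # retrieved


def build_database_reference(
    database_item: str,
    dataset_title: str,
    publication_date: Optional[str] = None,
    retrieved: Optional[str] = None,
    id_property: Optional[str] = None,
    id_value: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble a database reference as a property -> value mapping.

    At least one of `publication_date` or `retrieved` must be given, as
    suggested in the thread. `id_property` stands for a database-specific
    ID property, if one has been created.
    """
    if publication_date is None and retrieved is None:
        raise ValueError("provide publication_date or retrieved (or both)")

    reference = {STATED_IN: database_item, TITLE: dataset_title}
    if publication_date is not None:
        reference[PUBLICATION_DATE] = publication_date
    if retrieved is not None:
        reference[RETRIEVED] = retrieved
    if id_property is not None and id_value is not None:
        reference[id_property] = id_value
    return reference


# Example: the offline database discussed above; the dataset title and the
# retrieved date are placeholders, not real values.
print(build_database_reference("Q23764969", "example dataset title", retrieved="2016-04-10"))
```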
WikiCite applications closing soon
A reminder that applications to attend WikiCite 2016 – an event that should be of interest to Wikidatans active on source-related work – close this Monday April 11. We have a limited number of travel grants to support qualified participants. If you wish to join us in Berlin to participate either in the data modeling or engineering effort, please consider submitting an application --DarTar (talk) 16:09, 9 April 2016 (UTC)
archive URL (P1065) for Web pages
Another useful qualifier for web pages can be archive URL (P1065), for obtaining an archived version of the page when the URL has become obsolete. I suggest adding this to Help:Sources#Web page item 4 (if needed, other additional qualifiers). —surueña 13:10, 12 May 2016 (UTC)
- surueña Good idea. Snipre (talk) 19:28, 12 May 2016 (UTC)
- I think we have an archive-date-property somewhere, but I do not remember where! -- Innocent bystander (talk) 07:47, 13 May 2016 (UTC)
- Just added property archive URL (P1065) to Help:Sources#Web page, as well as archive date (P2960) as suggested by Innocent bystander, property which has been recently created. —surueña 07:51, 28 July 2016 (UTC)
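For illustration, a web page reference carrying the two archive properties mentioned above might be assembled like this. It is a sketch only: the function name, the dictionary layout and the example URLs are made up, and reference URL (P854) is included on the assumption that it is the property holding the page address in such a reference.

```python
from typing import Dict, Optional

REFERENCE_URL = "P854"  # reference URL
TITLE = "P1476"         # title
RETRIEVED = "P813"      # retrieved
ARCHIVE_URL = "P1065"   # archive URL
ARCHIVE_DATE = "P2960"  # archive date


def build_web_reference(
    url: str,
    title: str,
    retrieved: str,
    archive_url: Optional[str] = None,
    archive_date: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble a web page reference, optionally with an archived copy.

    An archive date without an archive URL is rejected, since the date
    would then describe nothing.
    """
    if archive_date is not None and archive_url is None:
        raise ValueError("archive_date requires archive_url")

    reference = {REFERENCE_URL: url, TITLE: title, RETRIEVED: retrieved}
    if archive_url is not None:
        reference[ARCHIVE_URL] = archive_url
    if archive_date is not None:
        reference[ARCHIVE_DATE] = archive_date
    return reference


# Example with placeholder URLs and dates.
print(
    build_web_reference(
        "https://example.org/page",
        "Example page",
        "2016-07-28",
        archive_url="https://archive.example.org/snapshot/page",
        archive_date="2016-07-01",
    )
)
```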
No save button and other issues
I have tried to add a source to the use of population (P1082) in Favrskov Municipality (Q512550). When I edit a new source the number of sources is indicated as "0+1" where the "+1" part seems to mean unsaved sources, but there is no save button. How do I save the source?
My source is Statistikbanken (Q12336913), which is an online database with many different tables. I tried to use stated in (P248): Statistikbanken (Q12336913) and title (P1476): "BY1: Folketal 1. januar efter byområde, alder og køn" (Database table title), but I think that it would be nice to have a "table" property. I have also wondered if I should create an item for that table alone. The table could be used as a source in 1000+ items for Danish municipalities and towns.
Another thing I have wondered about is whether it is necessary to use retrieved (P813) when the statements have a point in time (P585) qualifier. The source table has data for several years; newer values are added each year, but I don't expect the existing values to be changed.
One more question: What is the difference between stated in (P248) and imported from Wikimedia project (P143)? When is each of them used?
I appreciate any comments. Thank you, Dipsacus fullonum (talk) 08:47, 27 July 2016 (UTC)
- my understanding on your questions:
- you save the source when you save the entire statement - the save link just to the right of the main statement value should be available once you have added a valid source entry.
- If the table is a stable entity provided by the service that will be a useful source for many entries, by all means create a new item and use that for stated in (P248).
- In this case it sounds like retrieved (P813) isn't necessary but doesn't hurt to help fully define the reference (sometimes tables like this do change later)
- Definitely use stated in (P248) for this. I think imported from Wikimedia project (P143) is really only for bot-based imports of information from language wikipedias, stated in (P248) represents external sources which are much preferable.
- I hope this addresses your concerns? ArthurPSmith (talk) 13:53, 27 July 2016 (UTC)
- I use stated in (P248):Småorternas landareal, folkmängd och invånare per km2 2005 och 2010, korrigerad 2012-10-15 published by Statistics Sweden in this manner. It is also an Excel spreadsheet, which I have downloaded to my computer.
- And as ArthurPSmith says, the use of retrieved (P813) is more related to if the spreadsheet is stable or not.
- And you "save" the reference by saving the whole claim at the top of the claim: "Population: 123456 SAVE, remove, cancel". -- Innocent bystander (talk) 14:33, 27 July 2016 (UTC)
- Thank you very much for your answers. I had save button image, but it was grayed out and not clickable. Today it worked, so it probably was some kind of Javascript/browser error. Best regards, Dipsacus fullonum (talk) 12:45, 29 July 2016 (UTC)
- @Dipsacus fullonum: Yes, that has happened to me too. But the most common reason it is "grayed out and not clickable" is that something in the qualifiers/references is missing. -- Innocent bystander (talk) 13:48, 29 July 2016 (UTC)
How to reference to Scientific Archives
Dear colleagues! I am working with Scientific Archives of Karelian Research Centre of RAS. I have the following problem.
I have information from this archive (precise reference is "Fonds: 2, Series: 35, File: 3001, Item: 2") that the scholar Vladimir Ermakov knows English and German. Now I described it in the Wikidata in the following way:
- I added property languages spoken, written or signed to Vladimir Ermakov,
- I added property stated in,
- I added sub-property section, verse, or paragraph
- (main problem) I added plain text with reference to the document in the archive: "Fonds: 2, Series: 35, File: 3001, Item: 2 (In Russian: ф. 2, оп. 35, д. 3001, л. 2)".
If you look at the w:ISAD(G) standard and this document, then you see that "Fond, Series, File, Item" are mandatory fields for any documents in archives.
Please, could you add the following properties at Wikidata:
- "stated in archive" with four qualifiers:
- Fond (In Russian: "Фонд", short name for reference: "ф.")
- Series (In Russian: "Опись", short name: "оп.")
- File (In Russian: "Дело", short name: "д.")
- Item (In Russian: "Лист", short name: "л.")
For example:
- FAA. Fond 111. Series 5. File 669. Items (pages) 342, 347–347 overleaf, 374.
- RGADA. Fond 120. Series 7. File 160. Items 1-24, 26.
- Russian State Historical Archive. Fond 8. Series 2. File 8. Items 8, 15, 20.
Thank you! -- Andrew Krizhanovsky (talk) 08:35, 24 December 2016 (UTC)
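To make the proposed four-part locator concrete, here is a minimal sketch that parses the plain-text form used above ("Fonds: 2, Series: 35, File: 3001, Item: 2") into separate fields and renders the Russian shorthand. The class and function names are hypothetical, and the pattern only covers this one spelling of the citation; it is an illustration of the structure, not an existing Wikidata property or tool.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveLocator:
    """Hypothetical container for the four archival levels named in the proposal."""
    fonds: str
    series: str
    file: str
    item: str

    def english(self):
        return (f"Fonds: {self.fonds}, Series: {self.series}, "
                f"File: {self.file}, Item: {self.item}")

    def russian(self):
        # Short names as given in the proposal: ф., оп., д., л.
        return f"ф. {self.fonds}, оп. {self.series}, д. {self.file}, л. {self.item}"


# Matches only the "Fonds: …, Series: …, File: …, Item: …" spelling.
_LOCATOR = re.compile(
    r"Fonds:\s*([^,]+),\s*Series:\s*([^,]+),\s*File:\s*([^,]+),\s*Item:\s*(.+)"
)


def parse_locator(text):
    """Split a plain-text archive citation into its four fields."""
    match = _LOCATOR.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised archive citation: {text!r}")
    return ArchiveLocator(*(group.strip() for group in match.groups()))


locator = parse_locator("Fonds: 2, Series: 35, File: 3001, Item: 2")
print(locator.russian())
```

Keeping the four levels as separate fields is what would let each one become its own qualifier instead of one free-text string.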
how to re-use a single quote from a single source as a reference for multiple statements?
Is it necessary to copy all of the fields of a reference for one statement into a new reference for another statement? Or is there a way to more easily reference the same source (with the same quotation text or perhaps different or none) in support of multiple statements (perhaps for the same item or different ones)? DavRosen (talk) 23:16, 10 February 2017 (UTC)
- At Special:Preferences#mw-prefsection-gadgets you find a gadget called DuplicateReferences. Once activated it allows to copy and paste complete references within an item. —MisterSynergy (talk) 05:48, 11 February 2017 (UTC)
- But what about copying the same reference to statements in different items? For example, is it possible to create a reference item within Wikidata and simply refer to it more than once? DavRosen (talk) 15:43, 11 February 2017 (UTC)
- You can create an item about a book, a scientific article or a technical report and then use it as reference in other items. Just read the examples of Help:Sources.
- But you can't copy a reference between 2 different items in the same way as DuplicateReferences. Snipre (talk) 17:04, 11 February 2017 (UTC)
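For anyone working with exported statement JSON rather than the web interface, the idea of re-using one reference for several statements can be sketched as below. This only manipulates local dictionaries in the Wikibase JSON layout; the helper names are hypothetical, the item ID Q123 is a placeholder, and dropping the "hash" field rests on the assumption that the server recomputes it when a reference is saved.

```python
import copy


def clone_reference(reference):
    """Return an independent copy of a reference, without its server-side hash."""
    clone = copy.deepcopy(reference)
    clone.pop("hash", None)  # assumption: the hash is recomputed on save
    return clone


def attach_reference(statement, reference):
    """Append a copy of the reference to the statement's reference list."""
    statement.setdefault("references", []).append(clone_reference(reference))
    return statement


# A reference pointing at a placeholder source item via stated in (P248).
source_reference = {
    "hash": "placeholder-hash",
    "snaks": {
        "P248": [{
            "snaktype": "value",
            "property": "P248",
            "datavalue": {
                "value": {"entity-type": "item", "numeric-id": 123, "id": "Q123"},
                "type": "wikibase-entityid",
            },
        }],
    },
    "snaks-order": ["P248"],
}

# Two simplified statements, possibly from different items.
first = {"mainsnak": {"snaktype": "somevalue", "property": "P569"}}
second = {"mainsnak": {"snaktype": "somevalue", "property": "P570"}}
for statement in (first, second):
    attach_reference(statement, source_reference)
```

Because each statement receives its own deep copy, a later change to one reference (for example adding a page number) does not leak into the other.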
P792/P958/P478 alternative where the target is an item
I have a use for a property similar to chapter (P792)/section, verse, paragraph, or clause (P958)/volume (P478) but where the target is an item rather than a string. Looking around I haven't been able to find a good candidate. What I need it for is to indicate which section/part of an electronic report supports the statement. Defining the sections/parts as items (with some suitable instance of (P31) = <section of REPORTSERIES>) makes more sense to me than just providing the title of the section. And since the sections/parts are reused for the whole report series, creating unique items for each report+section feels like overkill. Any suggestions for a good property for this (or confirmation that one needs to be proposed) would be much appreciated. /André Costa (WMSE) (talk) 12:44, 28 February 2017 (UTC)
- stated in (P248) is I think our main reference property that points to items, is it not suitable in this case? ArthurPSmith (talk) 14:45, 28 February 2017 (UTC)
- Oh, I see, you're trying to have 2 properties, one for the report identifier, and one for the section within the report which is common across a series... Hmm. Sounds like we might need a new property for this - "report section" perhaps? ArthurPSmith (talk) 14:47, 28 February 2017 (UTC)
- That is exactly the case. /André Costa (WMSE) (talk) 17:17, 28 February 2017 (UTC)
- I don't understand: is the section an identified document with its own ISBN or other kind of identifier? Snipre (talk) 23:43, 28 February 2017 (UTC)
- Not in my case. The report is really an envelope of other files and folders. In addition to giving the main report and a link to the actual (internal) data file it would be good to indicate which section that data file lived in, since different sections naturally carry different data. I've made a quick overview below. Worth remembering is that the sections (2a, 2b, 3) are repeated in each country report and the last section (3) can be repeated multiple times within one report (once per geographical subdivision).
- 1. WFD 2016 reporting for Finland
- 2a. The RBDSUCA section of the report
- 2b. The River Basin District (RBD) section of the report
- 3. A Surface Water Body section of the RBD section (there are multiple)
- 4. The xml file to the actual data for the above section (Note that this is 13 MB so open at your own risk).
- Given the above example I would like to indicate 1 (the main report, and country), 4 (reference url), 3 (that it comes from a Surface Water Body section of the WFD reports). In addition I would also add the publishing date to distinguish if there were multiple releases (as is the case for Finland). Now rather than providing the url for 3 (which doesn't give any extra info for which 4 is not better) what I would like is to have that be a <Surface Water Body section of a WFD report> item.
- Hope this clarifies it a bit? /André Costa (WMSE) (talk) 09:04, 1 March 2017 (UTC)
Website name
Help:Sources#Web_page is missing the "website name", like "cite web" on Wikipedia. There is a publisher field in the example, but that is insufficient IMO. Also, could someone post an example or screenshot of a properly cited fact? Thanks. SharkD (talk) 11:49, 1 June 2017 (UTC)
Using "missing in source" as a source
In Krogsta (Q18291393) I have added that it was an instance of:minor locality in Sweden (Q14839548) from 1990-12-31 -- 1995-12-30. That it started in 1990 is sourced by page 15 in Småorter 1990 (Q20087097), and that it ended in 1995 is sourced by the fact that this entity is missing in Småorter 1995 (Q20087135), which is an authority on this subject. If this were Wikipedia, I would probably have written a note that the item is missing in "Småorter 1995", but how do I express that here at Wikidata? -- Innocent bystander (talk) 06:07, 14 July 2017 (UTC)
Serialized magazine article?
How would one source/describe a magazine article which was serialized across many issues of a magazine?
Example, pg. 326, ": https://play.google.com/books/reader?id=1eXNAAAAMAAJ&printsec=frontcover&pg=GBS.PA326
—Hawke666 (talk) 07:08, 23 August 2017 (UTC)
- How about create an item for the article, and add statements that specify each serialized part with a serial number (P2598) qualifier? ArthurPSmith (talk) 15:57, 23 August 2017 (UTC)
- That's a thought, though I didn't mean serialized in the sense of "having a serial number applied" but in the sense of "publish or broadcast (a story or play) in regular installments". As such I'm not sure whether a serial number (P2598) makes sense. Even if it does, how do I correctly connect the article to the publication? [[8]] says it should be linked with issue (P433) publication date (P577) and page(s) (P304) but there are many of all of those — if an article was spread across "pages 5-6 of issue 1 in January 1903" and "pages 10-13 of issue 2 in February 1903" and "page 2-7 of issue 3 in March 1903", how do I connect all those together without adding an item for every issue? —Hawke666 (talk)
- Is it really necessary to add all data for each publication ? Just a question. Snipre (talk) 19:33, 23 August 2017 (UTC)
- I don't know what’s necessary; that’s why I’m asking about how to source this properly. Current schema seems to assume that an article appears in its entirety in only one issue. —Hawke666 (talk) 20:18, 23 August 2017 (UTC)
- In your case, this is like a new edition of the article: we should create an item for each reprint of the article in order to keep a model similar to the one used for books. My first opinion is to create an item for the article you need. If you need to source something with one article, create the corresponding item. If somebody uses another "edition" of the article, he will create a new item. This is perhaps stupid, but it is correct according to the model we want to apply for books. Snipre (talk) 22:34, 23 August 2017 (UTC)
- You may be misunderstanding though, it's not reprinted in multiple issues. Instead a portion of the article (maybe a page or two) appears in each issue. Each part begins "Continued from page xxx" (referring to an installment in a previous issue) and ends with “To be continued”. —Hawke666 (talk) 20:14, 24 August 2017 (UTC)
- @Hawke666: I have a lot of such articles, published in the German 19th-century magazine Die Gartenlaube (Q655617). They published many of these 'long reads', partly for economic reasons, to keep readers interested in buying the next issue. I also found the bibliographic description a bit tricky. I am describing those articles based on the German Wikisource edition, where each split article is presented as one page, so it makes sense to describe it as one bibliographic unit. In the Wikidata record I created several page(s) (P304) statements, each with an issue (P433) qualifier, and, vice versa, separate issue statements with page qualifiers (e.g. Der Schutzgeist des Hauses (Q19165151)), so it is at least clearer which page numbers belong to which issue number. Of course it is still not semantically clear that this bibliographic record was published in installments rather than as a full copy in each of the given issues. So I have thought about adding serial number (P2598) for the different parts, but I doubt that this would make it much clearer; often the different parts have something like subtitles with a numbering style like "1. part, 2. part, end" etc. Maybe it would also be possible to add those values to the statements with a qualifier like "named as" or "subtitle". Here is a query for all Gartenlaube articles which were published in installments:
    SELECT ?artikel ?artikelLabel (COUNT(?heft) AS ?countParts) WHERE {
      ?artikel wdt:P1433 wd:Q655617;
               wdt:P433 ?heft.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
    }
    GROUP BY ?artikel ?artikelLabel
    HAVING (?countParts > 1)
    ORDER BY ASC(?countParts)
- --Mfchris84 (talk) 09:00, 12 February 2021 (UTC)
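The "one issue statement per installment, with page and date qualifiers" idea from this thread can be sketched with the three installments from Hawke666's example. This is a simplified, hypothetical dictionary layout for illustration only, not the Wikibase JSON format, and the `Installment` class and `issue_statements` function are invented names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Installment:
    """One part of a serialized article (hypothetical helper type)."""
    issue: str      # value for issue (P433)
    pages: str      # value for page(s) (P304)
    published: str  # value for publication date (P577), as year-month


def issue_statements(installments):
    """Build one issue (P433) statement per installment, qualified with
    page(s) (P304) and publication date (P577)."""
    return [
        {
            "property": "P433",
            "value": part.issue,
            "qualifiers": {"P304": part.pages, "P577": part.published},
        }
        for part in installments
    ]


# The three installments described earlier in the thread.
parts = [
    Installment(issue="1", pages="5-6", published="1903-01"),
    Installment(issue="2", pages="10-13", published="1903-02"),
    Installment(issue="3", pages="2-7", published="1903-03"),
]

statements = issue_statements(parts)
for statement in statements:
    print(statement)
```

Grouping pages and date under each issue keeps it clear which page range belongs to which issue, which is the ambiguity raised above when issue, date and pages are all listed as separate top-level statements.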
statements in described by source (P1343)
I think that most of them can be used as sources for P31 and P279 and other claims. d1g (talk) 13:11, 7 September 2017 (UTC)
Changes to "When to source a statement"
In some cases sources are not required:
1. When a value is common knowledge, and it has not been disputed.
We should allow descriptive references about P31|Q128207.
It is better to pick historic references for such claims. Over time we should source inventions when possible. But something like "knife P279 tool" is difficult to source indeed.
2. "When the item has a statement that refers to an external source of information"
We should indicate exceptions: when a property for a specific site has not been created, it is better to have a URL than no ID at all.
3. When the item itself is a source for a statement.
This point should allow "first occurrence" and "first definition at page" and "first description at page" uses and something similar.
d1g (talk) 13:24, 7 September 2017 (UTC)
Proposal on citation overkill
Please see Wikidata:Project chat#Proposal on citation overkill for a discussion about whether it is good practice to add additional sources to a property that already has high quality online sources. Jc3s5h (talk) 12:15, 25 October 2017 (UTC)
Which language property in references?
Today User:Tubezlob changed the language property to use in references from P2439 (P2439) to language of work or name (P407), which basically formally invalidates countless numbers of references. In the past I spent a lot of effort into proper use of sources according to the models described in this page, so I am not happy with this change. What can we do to settle this model in a way that we do not have to touch it again in future? --MisterSynergy (talk) 21:15, 18 November 2017 (UTC)
- Hi MisterSynergy, I changed this property to unify with the WikiProject Books. It is the only property that they use. Naturally this is a good thing to discuss here if you do not agree with that. Tubezlob (🙋) 21:26, 18 November 2017 (UTC)
- Note that there has been a long-running proposal to merge the various language properties here - maybe you both should weigh in about this particular consequence also? I don't see a strong need for multiple properties like this which indicate essentially the same thing. ArthurPSmith (talk) 00:51, 20 November 2017 (UTC)
- Yes, I am aware of the existence of that discussion, but not of all its details. As far as I know, language of work or name (P407) and original language of film or TV show (P364) should be (partially?) merged in the domain of books, but neither P2439 (P2439) nor Help:Sources seems to be affected by that change. Another approach could be to have a (new) language property specifically for references. I have no idea whether this is desired and whether it would have any advantages over the current situation.
- However, since particularly Wikipedia users rely on what’s written here on this page, we should make sure that actual references contain the properties described here, not others which might or might not be related. If one writes a module to automatically pull Wikipedia references from Wikidata, it is a mess to implement all types of (undocumented) workarounds and property switches into the code. I have no idea how many references appear broken if someone now started coding based on the current version of this page, while lots of references are not formed according to it. —MisterSynergy (talk) 09:30, 20 November 2017 (UTC)
Geographical coordinates from image metadata
If we add geographical coordinates to a listing based on the metadata of an image pertaining to that listing, would we add a source for this statement, and, if so, what kind? Thanks. ARR8 (talk) 21:59, 10 September 2018 (UTC)
- Hmm, coordinates of the point of view (P1259) is related, but in the other direction. Maybe a new property needed here? I can't think of an existing one for this! ArthurPSmith (talk) 19:26, 11 September 2018 (UTC)
- Thank you. Guess I'll ask around some more in a few other pages before learning the property proposal process. ARR8 (talk) 14:36, 16 September 2018 (UTC)
“Reference section”
Under Books, no. 10, the help page states “Add any additional properties to the reference section which can help when verifying the value of the statement.”
How do I do that? There is no “References” subheading in the Wikidata item. A reference field attached with “stated in“ doesn’t have any interface to add additional properties that I can find.
What do I click to do this? —Michael Z. 2019-01-21 17:11 z
- Okay, I see. Confusion between “+ add” within the reference item, accessible while editing the parent, and “+ add reference.” —Michael Z. 2019-01-21 17:19 z
Using commons PDF as source
Hello, I would like to add as a source a Wikimedia Commons file, c:File:TIC TAC UFO EXECUTIVE REPORT 1526682843046 42960218 ver1.0.pdf, to coordinate location (P625) of USS Nimitz UFO incident (Q48805044). How do I do that? Int21h (talk) 01:30, 14 June 2019 (UTC)
- @Int21h: For the general data, use the information available in wikisource:
- author: George Knapp
- title: I-Team Exclusive: Confidential report analyzes Tic Tac UFO incidents
- publication date: 2018-05-18
- stated in: KLAS-TV
- Or see the section Help:Sources#Media_&_Entertainment_(TV/radio/music/video) for more information. Snipre (talk) 18:37, 17 June 2019 (UTC)
- It seems to me KLAS is acting as a publisher; the author is someone at the Department of Defense (if you believe the file). Also, "stated in" would normally be the item of a particular article, TV show, book, etc. It would not be the item for a publisher. And it has to be an item, it can't be text. If George Knapp (a TV reporter?) stated on-air what the coordinates were, you could write the citation more or less as Snipre suggests, but you're not giving a URL to view the TV story, you're giving a Commons file. The Commons file wasn't written by George Knapp and its title isn't I-Team Exclusive: Confidential report analyzes Tic Tac UFO incidents. As far as I can tell the file is untitled. Jc3s5h (talk) 12:05, 19 June 2019 (UTC)
- I also see a "chain of custody" problem with this file. If this file were available directly from the Department of Defense, it would be somewhat reliable (with the obvious limitations of the sort of thing being described in the file). If it came from the website of KLAS-TV it would still be somewhat reliable, although a bit less so because local TV stations tend to be sensational. But this file came from Int21h. Wikimedia Commons editors are not reliable sources and we cannot trust that the material they upload is really what it purports to be. Jc3s5h (talk) 12:19, 19 June 2019 (UTC)
Tools to generate and add references
This page has great advice, but it overlooks the fact that creating items for references is often incredibly and prohibitively tedious. Going through 10 steps to create an item for an edition to a book to cite a statement like birth date is awfully arduous, and even typing "stated in" for existing references or "reference url" time after time gets old really fast. While some bots seem good at matching and adding references to certain statements based on external identifiers, we can't rely on bots to do all human work, and humans only have so many hours in a day. Many tools (like much of Wikidata) seem like arcane secrets, known to few, mentioned only in the bowels of chat archives or nestled inconspicuously deep in sub-pages and lists, but otherwise invisible to new users. Thus I propose the tools below, and any others like them, be prominently added to the bottom of Help:Sources. Please add any additional tools/scripts, etc. so that they can more easily be found and utilized. The easier it is to create and add references, and the more people who know how to do so, the fewer unsourced statements we'll have. Cheers, -Animalparty (talk) 02:10, 9 December 2019 (UTC)
- I'm fairly new here at the moment, but this description feels very familiar. I've needed to use the chat twice this week, with the solution being non-default tools. The first guide mentioned the Merge gadget, but I didn't need to install the MergeItems-tool to use it. This guide didn't even mention the DuplicateReferences tool.
- I support adding a section (Useful editing tools?) above See also-section. But as a new user here I don't really feel comfortable editing these main guides myself. BucketOfSquirrels (talk) 16:44, 22 October 2022 (UTC)
- DuplicateReferences (enabled in Gadgets), tool to copy and paste references to other statements within an item
- DragNDrop (enabled in Gadgets), a tool to copy references to other statements, or to import statements and references from other Wikimedia sites.
- Wikidata:SourceMD, a tool for quickly generating items for scholarly articles using ISBN-13 (P212), DOI (P356), ORCID iD (P496), PubMed publication ID (P698), or PMC publication ID (P932)
- User:Matěj Suchánek/moveClaim.js, a tool to move or copy statements (and accompanying references) to other items
- BNF to Wikidata, a tool to create new Wikidata items about a person, or import new statements for existing items, with statements referenced by the Bibliothèque nationale de France (French national library).
- I would add:
- User:Bargioni/UseAsRef, a tool to quickly add references to other statements on the basis of external IDs or websites
- --Epìdosis 19:18, 22 October 2022 (UTC)
- I definitely second UseAsRef, I use it all the time, super easy to use! ArthurPSmith (talk) 12:31, 24 October 2022 (UTC)
- +1. Thanks for the hint. --Kolja21 (talk) 02:59, 25 October 2022 (UTC)
The example reference for the newspaper/scholarly article appears to have been deleted 3 years ago
It appears that a user removed the reference for inception (P571) of Earth (Q2) in this revision. I went ahead and undid the revision, but I thought maybe since it's been 3 years I'd post here in case it's actually supposed to be gone. --Rampagingcarrot (talk) 00:32, 27 July 2020 (UTC)
How do I get sources to include more than just the URL for import to Wikipedia?
I'm importing a bunch of sources from Pomona College (Q7227384) to simple:Pomona College, and I'd like them to display nicely. How do I get them to include the date/page title/website/all the other normal reference fields, rather than just the URL? {{u|Sdkb}} talk 20:07, 29 July 2020 (UTC)
- @Sdkb: please have a look at Help:Sources#Web_page. Snipre (talk) 05:54, 31 July 2020 (UTC)
Editor name string
There’s an author name string (Q73980831) to enter names of authors without WD entries. How do we deal with absent editors? Just omit them? —Michael Z. 18:59, 10 December 2020 (UTC)
- In many cases it's time to create the item for the editor. If you don't want to do that you can use unknown value with object named as (P1932). ChristianKl ❪✉❫ 19:22, 10 December 2020 (UTC)
- Thank you. I see that for a person, subject named as (P1810) is more specific. —Michael Z. 20:49, 10 December 2020 (UTC)
- @Mzajac: You're misunderstanding object named as (P1932) and subject named as (P1810). "Subject" here does not refer to a person with "object" referring to an inanimate object, as in w:en:Subject and object (philosophy). Rather, "subject" refers to the Wikidata entity page on which the claim is found, and "object" refers to the value of the statement, as in w:en:Semantic triple. See Help:Statements. Daask (talk) 14:15, 9 October 2024 (UTC)
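ChristianKl's suggestion (an "unknown value" editor statement qualified with object named as (P1932)) might look roughly like this in the Wikibase JSON statement layout. This is a sketch under assumptions: the function name is hypothetical, "J. Doe" is a placeholder name, and editor (P98) is used as the main property because the question is about editors. The qualifier describes the object of the triple, that is, the statement's value, which matches Daask's explanation above.

```python
def unknown_editor_statement(name_as_printed):
    """Editor (P98) statement whose value is 'unknown value' (snaktype
    'somevalue'), with the printed name kept in object named as (P1932)."""
    return {
        "type": "statement",
        "rank": "normal",
        # 'somevalue' is the JSON spelling of "unknown value"; it has no datavalue.
        "mainsnak": {"snaktype": "somevalue", "property": "P98"},
        "qualifiers": {
            "P1932": [{
                "snaktype": "value",
                "property": "P1932",
                "datavalue": {"value": name_as_printed, "type": "string"},
            }],
        },
        "qualifiers-order": ["P1932"],
    }


# Placeholder name for illustration only.
statement = unknown_editor_statement("J. Doe")
```

If an item for the editor is created later, the main snak can be switched to a regular item value while the P1932 qualifier keeps the name as it was printed in the source.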
Article with multiple editions
I am looking at The Traditional Scheme of “Russian” History (Q28703759), but this must apply to many other examples. This important article has been republished several times, including in translation. So, extending the advice on books, I suppose it should become a work item with multiple edition/translation items. Is that how library cataloguing works for articles? Should it retain published in (P1433), or is that replaced by has edition or translation (P747)? —Michael Z. 16:26, 24 December 2020 (UTC)
- Followup question: is the usual single article item a work item or an edition item, or both? —Michael Z. 16:29, 24 December 2020 (UTC)
- A related discussion regarding preprints (pre-publication versions of an article that appear online somewhere) was recently raised on Project Chat - the consensus seemed to be to NOT create multiple items, but link to the various versions with their identifiers on the main item. Not sure if this applies with translations though? ArthurPSmith (talk) 19:38, 24 December 2020 (UTC)
Vital records and other archive collections as sources
Notified participants of WikiProject Czech Republic
Hi! I need to mass-change references to parish registers (of births, deaths and marriages) in more than 1,000 Wikidata items (discussion), such as here, due to new restrictions on stated in (P248), and would like to agree on how to proceed.
As of now, the references contain the archive name in stated in (P248). But for the last 3 weeks, it has been reported as an error - because we need to state a published work, not its publisher or custodian.
Therefore, I would like to agree the standard format of archive citations with the community. My suggestion is below. The example is taken from the birth date of Karl Mikolaschek (Q97993619) using the format of Anne Eliza Back (Q42333974) as per discussion on registration district (P5564) as a template:
- reference URL (P854) = URL pointing to the resource (optional, if available) = https://www.portafontium.eu/iipimage/30065205/loket-020_2100-n
- catalog (P972) = collection or subcollection containing the record (mandatory) = Collection of Registry Books at Pilsen State Archive (Q105319092)
- registration district (P5564) = subdivision within the collection with a Wikidata item, if available = here logically "Roman Catholic Parish of Loket", but I would rather skip it here - too small, too many
- catalog code (P528) = compact resource within the collection, such as book or documentation box that does not have a wikidata item (optional) = Loket 020
- inventory number (P217) = alternative identifier within the collection (optional) = (not used here, depends on custodian if available and published)
- page(s) (P304) = page within the resource if available (optional) = 211
- title (P1476) = full archive citation or another free-form text to locate the resource offline = SOA Plzeň, Matrika narozených Loket 020, s. 211
This structure does not contain stated in (P248), which is used in citation templates on Wikipedia - we may perhaps still think of a way to include it. I have also skipped publication date (P577), because it is usually self-evident on birth and death registers, but it can perhaps be used if it adds value (e.g., on late baptisms or weddings).
There can also be variants (optional properties) depending what is available - e.g. the archived documents may be copied and published by a 3rd party such as Family Search or Matricula Online and then the source could be a digital collection such as Hungary Civil Registration, Birth and Marriage Extracts (Family Search Historical Records) (Q94425614). On the other hand, in "original" archive resources such as the Karl Mikolaschek (Q97993619), URL links may rot but the archive citation (structured or unstructured) should be valid forever, even in their study room.
Any suggestion? Can we make an official recommendation? Thanks in advance! (Also thanks to Jura and Daniel Baránek for pointing me in the right direction.) --Sapfan (talk) 13:21, 6 February 2021 (UTC)
- To me, catalog (P972) seems a little off-target. It is a work which contains a listing of items; an item might refer to a stone tool in the museum's collection. We could not use "stated in" with P972 because the catalog listing the items would only describe the item, not spell out the information within the item. Why wouldn't we use collection (P195), which seems more targeted for works (although not exclusively works)? Since it refers to the group of works, it would be legitimate to use "stated in", because the birth date of Karl Mikolaschek is indeed stated in the collection. Of course, qualifiers such as volume, publication date, and page should be added to allow a person who has physical access to the collection to find the entry for Karl Mikolaschek (or whoever). Jc3s5h (talk) 17:47, 6 February 2021 (UTC)
- Looking at it more, I really oppose using catalog. Are we thinking of a catalog of people? Or is it a catalog of people's birth dates? That just doesn't seem right.
- I think there are probably three cases.
- Old records, where the paper record, whether it is an individual piece of paper or an entry in a book, is considered the original, official record. "Collection" seems to be the best for this. If the records are in the form of books, it might be useful to name not only the page, but the line(s) (P7421) within the page. Jc3s5h (talk) 18:21, 6 February 2021 (UTC)
- Publicly accessible database. An example is some vital records departments, such as Vermont's, which have made some of their records available through Ancestry.com, where they are searchable. We don't seem to have a property for naming a database as a reference. We probably should (or maybe we already do and I haven't found it).
- Service by written application. This is still a collection. Access is by going to a counter in an appropriate building, and asking for a particular record. Or, one might have to fill out a request form, the format of which would be different for each institution, and receive the result either in person or by mail.
- I believe we should avoid using any property for other than its intended purpose, because each record archive has its own idiosyncrasies, so different editors will invent different ways to abuse the existing properties, and the end result will be an incomprehensible mess. Jc3s5h (talk) 18:21, 6 February 2021 (UTC)
- Thanks Jc3s5h for your answer. So do you suggest, mainly in your case 1, to use an archive collection as stated in (P248) and define it as collection (P195)? This would be workable, except that we would need to replace a few more properties. Would this be a workable proposal?
- reference URL (P854) = URL pointing to the resource (optional, if available) = https://www.portafontium.eu/iipimage/30065205/loket-020_2100-n
- stated in (P248) = collection or subcollection containing the record (mandatory) = Collection of Registry Books at Pilsen State Archive (Q105319092)
- volume (P478) = compact resource within the collection, such as book or documentation box that does not have a wikidata item (optional) = Loket 020
- inventory number (P217) = alternative identifier within the collection (optional)
- page(s) (P304) = page within the resource if available = 211
- line(s) (P7421) = row within a page or document, if numbered
- title (P1476) = full archive citation or another free-form text to locate the resource offline = SOA Plzeň, Matrika narozených Loket 020, s. 211
Thanks, --Sapfan (talk) 19:21, 6 February 2021 (UTC)
- That looks good to me. Jc3s5h (talk) 19:51, 6 February 2021 (UTC)
- Thanks, Jc3s5h! Any other opinion? Is anyone against using this as a template for a mass change of existing records? --Sapfan (talk) 08:00, 7 February 2021 (UTC)
- Thanks for looking into this. Were you going to use "collection" or P248? Now the above shows P248. BTW, there are also folio(s) (P7416) and column (P3903) when one wants to specify. Also, I like the possible inclusion of registration district (P5564). I will try to check a few samples to see how it can be improved. In the meantime, don't hesitate to proceed based on a provisional version. --- Jura 08:54, 7 February 2021 (UTC)
- A few other thoughts: when explaining this approach, the language of the explanation and the language of the source should be the same (in this case, English). Also, the location of the archive should be in a country that speaks the same language as the explanation.
- For the title (P1476), a better explanation than "full archive citation or another free-form text to locate the resource offline" is needed. When writing citations according to a style manual such as Chicago Manual of Style, it is common to substitute a description of a work if the work does not have a title. But maybe there is a property available for describing a title-less work. If so, I don't know what it is. When using a procedure such as those in Chicago Manual of Style, a title of a large work is in italics, a small work is enclosed in double quotes, and a description has no special treatment. Those options are not available to us with title (P1476).
- As for volume (P478), record books are sometimes numbered in sequence without regard to the contents, in which case just a number would be appropriate. In other archives, the record books for different purposes may be numbered separately. For example, Births 5 might cover a date range that is later than Marriages 2. Jc3s5h (talk) 09:16, 7 February 2021 (UTC)
- I don't think we only want samples from English speaking countries in the English version. Help is meant to help users solve actual questions when contributing. There is no need to address all questions at once. --- Jura 09:22, 7 February 2021 (UTC)
- Hi Jura and Jc3s5h, thanks for the further suggestions! I am choosing a resource in Czech language, because this is the most typical example I work with. (In the US, Australia or perhaps GB, the format of vital records tends to be different - more database queries or 3rd party copies such as Family Search or Geni.com.) So let me update the proposal again - hopefully it will be more culture-neutral now:
Property | Explanation | Example (Czech with URL target in German) | Example (Austrian) |
Example | Date of birth of the following personalities: | Hermann Mattkey (Q96694727) | Josef J. Zapf (Q96901921) |
reference URL (P854) | URL pointing to the resource (if available) | https://vademecum.soalitomerice.cz/vademecum/permalink?xid=09ddd7cea03b9b8d:4e496e4e:12216bae987:-6c7c&scan=50820182bd164b85b51b9abef3c4412e | https://data.matricula-online.eu/de/oesterreich/wien/07-schottenfeld/01-043/?pg=273 |
stated in (P248) | collection or subcollection containing the record (mandatory); it can be the original archive or a 3rd party database | Collection of Registry Books at Litoměřice State Archive (Q105319095) | Matricula Online - Archdiocese Vienna (Q105357637) |
registration district (P5564) | regional subdivision of the database, if created in Wikidata | | parish Vienna 07, Schottenfeld (Q105357856) |
volume (P478) | compact resource within the collection, such as book or documentation box that does not have a wikidata item | 118/6 | Taufbuch 01-043 (if registration district (P5564) missing, then: 07. Schottenfeld, Taufbuch 01-043) |
inventory number (P217) | alternative identifier within the collection, if provided | 5500 | |
page(s) (P304) | page within the resource if available | 317 | 237 |
folio(s) (P7416) | use instead of page(s) (P304), if more appropriate | | |
line(s) (P7421) | row within a page or document, if numbered | | |
title (P1476) | title of the resource or a free-form text to identify and locate it offline | SOA Litoměřice, Matrika narozených N • inv. č. 5500 • sig. 118/6 • 1797 - 1823 • Most, Zahražany, s. 317 | Matricula Online, Rk. Erzdiözese Wien, 07. Schottenfeld, Taufbuch 01-043, S. 237 |
Can this become an official recommendation, or should we adjust anything else? Thanks! --Sapfan (talk) 12:15, 7 February 2021 (UTC)
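For readers who prefer to see the table as data, here is a minimal sketch in Python. It is only an illustration, not an agreed format: the helper name `build_reference` and the plain-dict layout are assumptions made for this example, and the values are copied from the Josef J. Zapf (Q96901921) example in the table.

```python
# Hypothetical helper: the property IDs come from the table above,
# but the dict layout and the function name are assumptions.
def build_reference(**fields):
    """Return a reference as {property ID: value}, skipping empty cells."""
    if not fields.get("P248"):
        # The proposal marks stated in (P248) as mandatory.
        raise ValueError("stated in (P248) is mandatory in this proposal")
    return {pid: value for pid, value in fields.items() if value}


zapf_birth_reference = build_reference(
    P854="https://data.matricula-online.eu/de/oesterreich/wien/07-schottenfeld/01-043/?pg=273",
    P248="Q105357637",        # Matricula Online - Archdiocese Vienna
    P5564="Q105357856",       # parish Vienna 07, Schottenfeld
    P478="Taufbuch 01-043",   # volume
    P217="",                  # inventory number: not provided in this example
    P304="237",               # page(s)
    P1476="Matricula Online, Rk. Erzdiözese Wien, 07. Schottenfeld, Taufbuch 01-043, S. 237",
)
```

Empty cells such as the inventory number are simply left out of the result, which mirrors the "if available" wording in the table.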
- I think the main question is if we should use stated in (P248).
- section, verse, paragraph, or clause (P958) is used for unformatted references in other works. Not sure if this could work here or another one is needed.
- If you go for exhaustiveness, maybe you want to include type of reference (P3865) as well. --- Jura 08:53, 9 February 2021 (UTC)
- I think stated in (P248) is used to specify the work or collection that makes the statement that we have repeated in Wikidata. This work can be further qualified with section, verse, paragraph, or clause (P958) if that is the way the work is organized, although a page number, or image on a microfilm, would be more typical for an archive of vital records. It's important to notice that stated in (P248) takes an item as its value and section, verse, paragraph, or clause (P958) takes a text string as its value. Jc3s5h (talk) 19:28, 9 February 2021 (UTC)
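The item-versus-string distinction made above can be sketched as a small check. This is an illustration only: the two set names and the function `value_fits_property` are made up for this example, and only a handful of the properties discussed here are listed.

```python
import re

# Datatypes as described in this discussion; the groupings below are
# illustrative and cover only a few of the properties mentioned.
ITEM_PROPERTIES = {"P248", "P5564"}            # value is a Wikidata item (Q-id)
STRING_PROPERTIES = {"P958", "P478", "P304"}   # value is a plain text string


def value_fits_property(pid, value):
    """Return True if the value has the shape the property expects."""
    if pid in ITEM_PROPERTIES:
        return re.fullmatch(r"Q[1-9][0-9]*", value) is not None
    if pid in STRING_PROPERTIES:
        return bool(value.strip())
    raise KeyError(f"{pid} is not covered by this sketch")
```

For instance, `value_fits_property("P248", "SOA Litoměřice")` is false because a free-text archive name is not an item, while the same text would be acceptable for P958.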
- The question is if it should be stated in (P248) or some other item-based property.
- section, verse, paragraph, or clause (P958) (or some other property) should be compared with title (P1476). These capture what isn't done by other properties in a formatted way. --- Jura 08:07, 10 February 2021 (UTC)
- @2le2im-bdc: maybe you want to comment as you wrote some/most of Wikidata:WikiProject Archival Description/Data structure that looks at the same information from a different angle. --- Jura 08:53, 9 February 2021 (UTC)
- Page falls below folio in the hierarchy, since a folio, or leaf, has two pages, recto and verso.
- In some places in Eastern Europe there is an archive structure that I’ve seen in Ukrainian sources: фонд (fond, “fund” or “fonds”), опис (opys, “description” or “account”), справа (sprava, “case” or “topic,” =file unit (Q59221146)?), аркуш (arkush, “folio” or “sheet,” =folio(s) (P7416)). Elsewhere I’ve seen reel for microfilm.
- I have no idea what else this corresponds to, but it would be helpful to suggest a consistent way to enter such data. —Michael Z. 17:31, 9 February 2021 (UTC)
- Hello @Sapfan, Jc3s5h, Jura1:. Thanks a lot for your work. I just noticed that there is an existing model for referencing a database through the value of an external ID. Perhaps it is the preferable way for online catalogues with persistent identifiers. --2le2im-bdc (talk) 20:22, 14 February 2021 (UTC)
- I think it makes a difference what kind of database it is. The database might contain the information we are interested in, such as date of birth, directly. Or, the database might be an index to the holdings of an institution, for example, FamilySearch Historical Records (Q94420095) might tell us that a person with the same name as the person described in a Wikidata item is mentioned on a certain image of a certain microfilm, which is held in the Family History Library (Q4565916), but to find the birth date you would have to visit the library in person and view the microfilm. Or, a database might describe a publication that is not available from the organization that runs the database, and finding the publication would be a completely separate operation from reading the database entry. Jc3s5h (talk) 20:50, 14 February 2021 (UTC)
- Just for information, I have built (with the important help of the Wikidata community) a SPARQL query to find people with no reference on their date of death. It could be modified for each place, for each date, or for birth or marriage. It could provide a strategic plan for completing the references. --2le2im-bdc (talk) 21:05, 16 February 2021 (UTC)
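The query itself is not reproduced in this thread. As a rough idea of what such a query could look like on the Wikidata Query Service, here is a sketch that builds the query text in Python. The function name and its parameters are assumptions for illustration; P570 is date of death, P569 is date of birth and Q5 is human. Without a further restriction (place, period), a query of this shape may well time out.

```python
def unreferenced_statements_query(property_id="P570", limit=100):
    """Build a SPARQL query for humans whose statement has no reference.

    This is a guess at the general shape, not the query mentioned above.
    The prefixes wd:, wdt:, p:, ps: and prov: are predefined on the
    Wikidata Query Service, so they are not declared here.
    """
    return f"""
SELECT ?person ?value WHERE {{
  ?person wdt:P31 wd:Q5 ;
          p:{property_id} ?statement .
  ?statement ps:{property_id} ?value .
  FILTER NOT EXISTS {{ ?statement prov:wasDerivedFrom ?reference . }}
}}
LIMIT {limit}
""".strip()


# Same query shape for dates of birth instead of dates of death.
birth_query = unreferenced_statements_query("P569", limit=10)
```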
- Interesting activity! Actually, I am doing the opposite - manually scanning through selected (Czech) death registers page by page, entering references and sometimes even missing dates + places into Wikidata. It goes slowly, I have so far managed only about 1000 records. See a sample. Once we have more, it would be interesting to compare with your tool, how many records from specific cities, time periods etc. we are still missing. But let's agree on the formats first - this is how I got to this discussion! --Sapfan (talk) 13:14, 20 February 2021 (UTC)
- @Sapfan, Jc3s5h, Jura1:. After a short discussion on Twitter with French-speaking archivists, we have chosen several possibilities for citing civil registries as references. More details on Archival Description Project. --2le2im-bdc (talk) 20:09, 18 February 2021 (UTC)
- Thanks for looking into this. I like the outline as intro. However, we should attempt to consolidate this into Help:Sources, information elsewhere shouldn't contradict this page. As for the proposal there, I'm not sure if stated in (P248) should have classes as value. We did create type of reference (P3865) for the purpose you proposed there. Not sure about the use of inventory number (P217), but maybe Wikidata:Property_proposal/record_number finally isn't needed. --- Jura 20:26, 18 February 2021 (UTC)
- Hi @2le2im-bdc:! Sorry for not following this discussion for some time and thank you for preparing and linking the French archive description format with example (Camille Moreau-Nélaton (Q13080821)). I like the idea, but have a few comments:
- I think we should distinguish birth/marriage/death certificates (birth certificate (Q83900), marriage certificate (Q1299632), death certificate (Q708653)), which are standalone documents issued to parties, from registry books (vital record (Q18562479) with subclasses birth registry (Q11971341) (or baptism registry (Q28369847)), death registry (Q12029619) and marriage registry (Q14324227)). Actually, the death of Camille Moreau-Nélaton (Q13080821) is referenced by death registry (Q12029619), not death certificate (Q708653) - unlike this example - and the link points to a specific page, so it should be a reference URL (P854) rather than full work available at URL (P953). Do you agree?
- It is a bit of a pity that you use collection (P195) instead of stated in (P248), which is more widely used in templates. E.g. in Czech Wikipedia, references without stated in (P248) are frowned upon. Should we duplicate the same value in both properties? That is inconvenient.
- So, what about the following guidelines:
- The purpose is to identify a document in such a way that researchers can locate it online or offline, even if its presentation changes and URL links rot. The choice of properties should reflect the standards set by the creator or custodian of the collection.
- Archived documents and database entries should be identified by:
- collection (P195). The same value may be repeated as stated in (P248) for a better usability in templates. => Controversial, to be agreed if we want it!
- type of reference (P3865) such as birth registry (Q11971341), baptism registry (Q28369847), death registry (Q12029619), marriage registry (Q14324227), birth certificate (Q83900), marriage certificate (Q1299632), death certificate (Q708653)
- reference URL (P854) if a URL link points to a specific page, full work available at URL (P953) if user needs to navigate further
- title (P1476) with archive citation or another free text to help a researcher locate the resource offline or in a changed online presentation format
- and preferably also by additional structured characteristics, such as: registration district (P5564), volume (P478), inventory number (P217), folio(s) (P7416), page(s) (P304) (from printed original), file page (P7668) (from scanned copy), line(s) (P7421) and/or country-specific ones such as Archival Resource Key (P8091) (French), if provided.
We will never have one format worldwide, because there are big format differences. So my best advice would be, to strongly suggest a few properties and give a list of many other potentially useful ones to choose from, if suitable. Would this be a way forward?
Thanks to all, especially Jura, for keeping the discussion running.--Sapfan (talk) 13:14, 20 February 2021 (UTC)
- Sapfan wrote "I think we should distinguish birth/marriage/death certificates (birth certificate (Q83900), marriage certificate (Q1299632), death certificate (Q708653)), which are standalone documents issued to parties, from registry books (vital record (Q18562479) with subclasses birth registry (Q11971341) (or baptism registry (Q28369847)), death registry (Q12029619) and marriage registry (Q14324227))."
- In Wikidata, the various certificates are not clearly defined. From my experience researching ancestors in the US, I see the situation with certificates has changed over time. Early on (maybe 1820 or so) the original record of a vital event would be entered in a register book by a municipal clerk or religious minister. If it was a municipal record, and still survives, the municipality did, and even today still does, upon the request of a member of the public who pays the required fee, transcribe the details onto a certificate and issue the certificate to the requester.
- Later, starting in perhaps 1860, a person concerned with an event would fill out a piece of paper, which might be called a "Certificate of Live Birth", "Death Certificate", "Marriage License", etc. and file it with the municipality. The municipality would preserve the original record, and issue certified copies upon request to members of the public. The municipality, and perhaps the state, would prepare indices of these original documents. These indices might or might not contain all the details in the original document.
- Beginning in the late 20th century, persons concerned with births and deaths such as hospital clerks and funeral directors were given electronic access to the state vital records department, and became able to enter birth and death event records directly into the state vital records database(s). Now, the original record is a database entry. Paper certificates can be issued to eligible members of the public upon request and payment of fees. Because marriages are officiated by a large number of ministers, rabbis, mullahs, justices of the peace, etc., it is not feasible to give all these individuals access to state databases, so generally the officiant fills spaces on the paper marriage license and returns it to the municipality. (As a justice of the peace, I have done this myself.)
- I'm not sure how important it is, in Wikipedia, to distinguish among these types of records. Jc3s5h (talk) 19:06, 20 February 2021 (UTC)
- Hi Jc3s5h, thanks for this insight. At least in Europe, there has traditionally been (and probably still is) a book with records kept by the municipality, supported by a collection of original documents (e.g. from hospitals). These documents are later archived. Out of the register book, the authorities issue certificates to involved parties. But it can be different in the US or other regions - which can lead to further record types. Regarding your comment "I'm not sure how important it is, in Wikipedia, to distinguish among these types of records.": It probably does not help a researcher much, it is just extra information. I can imagine references without type of reference (P3865). But if we fill it, then let's do it correctly. In many (if not all) countries, there is a conceptual difference between a public registry and a certificate issued out of it. OK? --Sapfan (talk) 19:42, 20 February 2021 (UTC)
- Yes, there is a conceptual difference between a public registry and a certificate issued out of it, even though sometimes the latter is a photocopy of the former. And it may, or may not, be possible for a researcher to go into the vault and personally examine the original record. Jc3s5h (talk) 19:53, 20 February 2021 (UTC)
- Hello @Sapfan, Jc3s5h:. To clarify the terms of the discussion: acts of civil registry are part of the civil registry and receive a number within it. Certificates are copies of these acts. There is perhaps a cross-lingual difficulty. The Wikidata item birth certificate (Q83900) refers in French primarily to the act, but also, as an alias, to the certificate, and in English only to the certificate. "Act of birth" does not exist in English in Wikidata. Is it so important that we have to make two separate items? Poke @VIGNERON:. --2le2im-bdc (talk) 21:19, 20 February 2021 (UTC)
- Thanks a lot @Jura1: for the distinction between stated in (P248) and type of reference (P3865). It makes sense indeed. And I find your "Record Number" property proposal very interesting, in particular for specifying the number of the act in the civil registry. --2le2im-bdc (talk) 21:40, 20 February 2021 (UTC)
- Record number can indeed be the best reference, if available. But older records usually do not have any and we only have archive -> book ID -> page, scanned directly from the registry or written as source on a certificate. I would put it as an optional parameter - "fill if you can". 2le2im-bdc: I agree that both a registry and a certificate from it are essentially the same source. But if we have separate data items for them, I would suggest to use them. It can be country- and source-specific, similar to everything else we talk about here. Each creator, custodian and presenter has its own standards of document identification. --Sapfan (talk) 21:54, 20 February 2021 (UTC)
It seems we have several related items:
- register (Q19386377) seems to refer to an actual book, collection of original certificates, or database
- register office (Q745221) a government office where vital events are recorded
- civil registration (Q83708009) not sure what this is.
Jc3s5h (talk) 00:01, 21 February 2021 (UTC)
- Hello @Sapfan:. There are many open questions. I will go through them slowly, one by one. In my view, collection (P195) and stated in (P248) do not have the same object. The first makes a connection to the institution and/or the wider collection; the second connects to the document or the series of documents. You speak about templates in Wikipedia; could you give us some examples? Thanks in advance --2le2im-bdc (talk) 22:22, 21 February 2021 (UTC)
- Hello @2le2im-bdc:. Sorry, I overlooked the notice. An example on Czech Wikipedia is template Infobox - person, used e.g. here. It contains a row Relatives (Příbuzní), which itself comes from another template, Relatives from WD. I am not exactly sure how they do it technically, but the result is as in the linked article - (at least) grandchildren are supplied with references, which contain reference URL (P854) and stated in (P248). Thanks! --Sapfan (talk) 08:38, 27 February 2021 (UTC)
- Thanks @Sapfan:. The technical mechanism for pulling references from Wikidata into a WP page is very interesting. I will propose adding it to the use of archives at (P485) in the Bibliographie2 infobox on French Wikipedia. However, I think that stated in (P248) is not the right property in the case of father (P22) of Božena Auštěcká (Q95386964). Prague City Archives (Q19672898) is not a collection. It is an institution. --2le2im-bdc (talk) 20:33, 27 February 2021 (UTC)
- Thanks @2le2im-bdc:! I agree with your last comment - this is why we have this discussion at all. There is a plan to mass replace institutions with collections, in this case Prague City Archives (Q19672898) with Collection of Registry Books at Prague City Archives (Q105319160), but still within the property stated in (P248) - to keep it displayed in templates such as this one. But yes, we can also include Prague City Archives (Q19672898) in archives at (P485). Can we now finalize the template? --Sapfan (talk) 22:17, 27 February 2021 (UTC)
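The planned replacement can be pictured with a short sketch. Everything here is hypothetical apart from the two item IDs quoted in the comment above: the mapping name, the function name and the plain-dict reference layout are assumptions for illustration, and nothing in this thread settles whether archives at (P485) is the right place for the institution inside a reference.

```python
# Institution -> registry-book collection, using the pair named above.
INSTITUTION_TO_COLLECTION = {
    "Q19672898": "Q105319160",  # Prague City Archives -> its Collection of Registry Books
}


def move_institution_to_collection(reference):
    """Return a copy where P248 names the collection and P485 the institution."""
    updated = dict(reference)
    institution = updated.get("P248")
    if institution in INSTITUTION_TO_COLLECTION:
        updated["P248"] = INSTITUTION_TO_COLLECTION[institution]
        # Keep the institution, but do not overwrite an existing P485 value.
        updated.setdefault("P485", institution)
    return updated
```

References whose P248 value is not in the mapping are returned unchanged, so the same function could be run over a mixed batch of records.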
- Yes of course @Sapfan: ! The only modifications to your previous template are to add stated in (P248) and section, verse, paragraph, or clause (P958). Archival Resource Key (P8091) can also be mentioned. For an example, see given name (P735) of Victor Hugo (Q535). Thanks a lot for all your work! --2le2im-bdc (talk) 20:50, 8 March 2021 (UTC)
- Hi @2le2im-bdc: (and @Jc3s5h, Jura1: Notified participants of WikiProject Czech Republic), thanks for your feedback! I have just tried to summarize the discussion on the official page. Hope it is not premature. Feel free to edit further as needed, e.g. if I missed something. Once there is a stable version for about a week, I will ask for a mass change of existing records. Thanks to all for taking part in the discussion! --Sapfan (talk) 21:13, 9 March 2021 (UTC)
- Great @Sapfan:! I have publicized the result in a post on Twitter! --2le2im-bdc (talk) 19:27, 10 March 2021 (UTC)
- I'm not really convinced by the use of title (P1476). Everywhere else this is the actual title of a work. Why would it change merely because it's held by an archive? --- Jura 06:17, 15 March 2021 (UTC)
- Thanks for your comment, Jura. I get your point (it is an abuse of a field with a different purpose), but: 1. archive resources often do not have fixed titles (it is not a published book, it is a unique document) and 2. we need a free field to place a description (else the information may get lost if the URL changes - unless we are sure the other elements are sufficient). Is there any other suitable field? --Sapfan (talk) 07:19, 15 March 2021 (UTC)
- As for stated in (P248), I don't think we should use that merely because some template in Wikipedia doesn't support a possibly more appropriate property. collection (P195) seems more suitable given the values used. --- Jura 06:17, 15 March 2021 (UTC)
- I get your point, but on the other hand, citation templates should be able to work with many types of references (websites, books, magazines, documents...) and if each uses a different set of properties, then they become big and complex. At Czech Wikipedia, I got a strong recommendation to always include stated in (P248). But you are right, archive collections are actually collection (P195)-s. Should we then use both? Or reserve stated in (P248) for "anything else but archives"? (That would probably not make much sense). --Sapfan (talk) 07:19, 15 March 2021 (UTC)
- P248 is important, but I don't think it's always applicable. For a different matter, we created applicable 'stated in' value (P9073) to facilitate this.
- There are also title of broader work (P6333) and published in (P1433) for other cases. When these are used, wouldn't stated in (P248) be omitted? --- Jura 13:12, 15 March 2021 (UTC)
- Thanks for an introduction to a new range of properties, which I see for the first time. But don't they have other special purposes? E.g. applicable 'stated in' value (P9073) is clearly a "property of a property" (to enforce a specific value of stated in (P248) if another property is assigned). As for title of broader work (P6333), it is a free field to be used if the higher level work does not have an item (creation date of Rosenborg Tapestries (Q12333754)). And published in (P1433) in references (e.g., date of death of Isaac Komnenos (Q246434)) is to me clearly synonymous with stated in (P248); published in (P1433) should in my view mainly be used to build hierarchies (e.g. article -> issue -> volume -> magazine title), but "we have democracy" and people of course choose what they like or know. To summarize: I still think that stated in (P248) is the best choice for references if the source - on whatever level (collection, magazine, book, or specific chapter / document / article) has a WD item. The other ones can help in specific cases (e.g. title of broader work (P6333) if no item exists or published in (P1433) to organize published works). Does it make sense? --Sapfan (talk) 17:40, 15 March 2021 (UTC)
Using reference for archive info in URL statements
For a URL statement, would it be OK to add a reference containing archive-url and archive-date? Or is this an unacceptable repurposing? The reason for this usage is that a qualifier with archive-url is not enough; archive-date is also needed. AFAIK, only a reference can contain both while staying linked to the URL statement. --TRANSviada (talk) 21:02, 25 February 2021 (UTC)
Clarification at "Databases" when used on items with identifier in external-id
At Help:Sources#Databases, "title" seems useful mainly if the external-id property value isn't already in the identifier section of the item. In that case, a qualifier should be added to that statement instead. I clarified the first point on the page. --- Jura 09:20, 28 September 2021 (UTC)
- Not happy with this one, although I understand your motivation to reduce duplication.
- IMO reference usage should not require data users to search in various places for potentially useful information, thus the title of the referenced work should not be placed elsewhere as well. References only work if they are self-contained to a high degree.
- Consider data usage in Wikipedia, for instance, with a display of the reference as given in Wikidata. Typically, the bare minimum information to compile a useful reference would be URL (to be compiled from identifier and its formatter URL), a title of the referenced work, and a retrieval date. Besides the identifier qualifier, you need retrieved (P813) and title (P1476).
- Please also mind that "title" qualifier use in identifier claims is not very consistent. As far as I am aware, there are at least title (P1476), subject named as (P1810) and sometimes even object named as (P1932) in use.
- The stated in (P248) value of a reference, which may be redundant itself due to applicable 'stated in' value (P9073) on the identifier property, does of course potentially contain further valuable information—but if it is missing, it does not result in a bad reference. —MisterSynergy (talk) 11:22, 28 September 2021 (UTC)
- Yes, applicable 'stated in' value (P9073) should be able to provide for missing stated in (P248). There are cases where stated in (P248) is still needed, but maybe we can do away with it in general (for now, I merely mentioned it, see #Mention_applicable_'stated_in'_value_(P9073) below).
- The value of either property should also provide title and further information about the work. The id-property then provides the exact location in that work. For most if not all citations that should be sufficient. I hope we won't end up creating an item for each ID value.
- Agree that some harmonization of the qualifiers in the identifiers section of items would be helpful, but this isn't directly covered by this page. --- Jura 11:38, 28 September 2021 (UTC)
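As an illustration of the "bare minimum" reference described in this thread (a URL compiled from the identifier and its formatter URL, plus title (P1476) and retrieved (P813)), here is a minimal Python sketch. The function name, the example formatter URL and the sample values are hypothetical; the only convention assumed is that formatter URLs use `$1` as the placeholder for the identifier value, and percent-encoding the identifier is a choice of this sketch rather than documented behaviour.

```python
from urllib.parse import quote


def build_minimal_reference(formatter_url, external_id, title, retrieved):
    """Assemble a self-contained reference from identifier data (illustrative only)."""
    # Assumption: "$1" in the formatter URL stands for the identifier value.
    # Percent-encoding the value is this sketch's own choice.
    url = formatter_url.replace("$1", quote(external_id, safe=""))
    return {
        "P854": url,        # reference URL, compiled from identifier + formatter URL
        "P1476": title,     # title of the referenced work
        "P813": retrieved,  # retrieval date
    }


# Hypothetical sample values, not taken from any real property or item.
ref = build_minimal_reference(
    "https://example.org/id/$1", "abc 123", "Example record", "2021-09-28"
)
```

A data user who receives such a dictionary can render a citation without looking anything up elsewhere, which is the self-containment argument made above.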
Clarification about retrieved date or published date
There are a few statements that include current data and qualify the date with point in time (P585). Sample properties: social media followers (P8687), review score (P444), etc. For these, adding publication date (P577) or retrieved (P813) would be redundant. I clarified that at Help:Sources#Databases. --- Jura 09:20, 28 September 2021 (UTC)
Clarification at Web page
At Help:Sources#Web_page, it may not be clear that if there is a Wikidata property for a website, Help:Sources#Databases would be the sample to follow. I added a clarification about that.
I don't think we have any properties that aren't for web-exposed databases anyway. --- Jura 09:20, 28 September 2021 (UTC)
Mention applicable 'stated in' value (P9073)
applicable 'stated in' value (P9073) wasn't mentioned yet on the page. It simplifies determining the value for stated in (P248). I added that at Help:Sources#Databases. --- Jura 09:20, 28 September 2021 (UTC)
Main types of statements
Maybe it's worth mentioning that Wikidata has (now) three types of statements: facts about a subject, mappings of external identifiers and mappings of Wikimedia Commons media. There are also mappings of Wikipedia/Wikimedia pages, but these are generally not added as statements. The guideline is useful mainly for the first group. Maybe a fourth group could be "category of" links. --- Jura 10:47, 4 November 2021 (UTC)
Update the part about imported from Wikimedia project (P143)
The part about P143 wasn't necessarily representative of current practice. I updated it. --- Jura 10:47, 4 November 2021 (UTC)
Heuristic references
Should we have some section in this help page that explains when and how to use heuristics as references (list of heuristics)? I think it's probably one of the more nuanced cases where you might infer a publication's main subjects from its abstract (example), or a person's nationality from their profession (example). Inferring gender from images or names probably deserves a comment as well. T.Shafee(evo&evo) (talk) 05:20, 5 October 2022 (UTC)
- Note, we do have a few examples at the property 'based on heuristic' (P887), but some written guidance would be useful for newer users. T.Shafee(evo&evo) (talk) 23:27, 5 October 2022 (UTC)
How to use Source MetaData tool?
The help page says that I can use the Source MetaData tool to import a scientific article metadata when I know its DOI. However, when I enter a DOI value and hit "Check/Add papers", it returns "Not logged in!" What am I doing wrong? I cannot find any login link on the tool's page. Роман Рябенко (talk) 21:33, 10 November 2022 (UTC)
- @Роман Рябенко: Not sure if it helps but try logging in to QuickStatements first. --Matěj Suchánek (talk) 19:39, 12 November 2022 (UTC)
- @Matěj Suchánek, thank you for trying to help! I logged into QuickStatements but I still get the same message with Source MetaData tool. I looked into "Show latest batches" on the same page and it appears that the latest submissions were in March 2020. It looks like the tool is not used anymore. Роман Рябенко (talk) 23:02, 14 November 2022 (UTC)
- Well, Wikidata:SourceMD suggests using the old version. --Matěj Suchánek (talk) 16:50, 16 November 2022 (UTC)
- Thank you! I confirm that it works great. I think the external link to Source MetaData tool in Help:Sources#Scientific, newspaper or magazine article should be replaced with the internal link to Wikidata:Source MD, so that contributors have a chance to see the notice at the top of the page. Роман Рябенко (talk) 23:54, 24 November 2022 (UTC)
Broken Example, proposed fix.
At Help:Sources#When_to_source_a_statement, in point 2, Harry Potter and the Philosopher's Stone (Q43361)'s GND ID (P227) serves as a poor/invalid example of a statement that does not need a source, because it has a source. Should we delete the source? "Imported from German Wikipedia" doesn't strike me as a verifiable source (too vague), so that's two reasons to remove it. I plan to, shortly. RudolfoMD (talk) 22:25, 22 October 2023 (UTC)
- Support ArthurPSmith (talk) 17:58, 24 October 2023 (UTC)
- Thanks. Deleted. RudolfoMD (talk) 20:41, 25 October 2023 (UTC)
Point one states the obvious fact that books might be interwiki-linked to Wikisource. As for the second point, I doubt Wikisource has texts that are not eligible to have a Wikidata item; therefore, I think, this point is also redundant. I propose removing this subsection. Janhrach (talk) 08:58, 15 June 2024 (UTC)
- As for the second, it discusses subpages. s:The Curse of Capistrano has a Wikidata item, but I don’t think s:The Curse of Capistrano/Chapter 11 merits its own item, so you have to use reference URL (P854)https://en.wikisource.org/wiki/The_Curse_of_Capistrano/Chapter_11 in addition to stated in (P248)The Curse of Capistrano (Q639072). —Tacsipacsi (talk) 13:54, 16 June 2024 (UTC)
- @Tacsipacsi: In your example, a more semantically clear and machine-readable approach would be to use chapter (P792) – this is also supported by this help page. My argument (I doubt Wikisource has texts that are not eligible to have a Wikidata item) therefore also applies to split-up texts. But thank you for the clarification. Janhrach (talk) 15:30, 21 June 2024 (UTC)
- @Janhrach: chapter (P792) augments reference URL (P854) nicely, but it doesn’t substitute it – from stated in (P248)The Curse of Capistrano (Q639072) and chapter (P792)11, it cannot be automatically decided if the link should point to s:The Curse of Capistrano/Chapter 11, s:The Curse of Capistrano/11 or something else. —Tacsipacsi (talk) 23:53, 21 June 2024 (UTC)
- In my understanding, whether the cited work is available on Wikisource is not relevant to the citation itself, but it may be relevant for the reader (but is not relevant for, e.g, a bot using Wikidata). But this guideline does not recommend adding reference URL (P854) for citations citing a chapter of a work with a Q-identifier – the only exception is "citing Wikisource".
- Say the chapter is also published online elsewhere. To add every such location to the reference would be very impractical – this guideline does not even mention this option. Wikisource is treated differently (i.e. it may be linked in the reference URL), which is understandable, but not correct in my opinion. Janhrach (talk) 08:42, 22 June 2024 (UTC)
Say the chapter is also published online elsewhere. To add every such location to the reference would be very impractical – this guideline does not even mention this option.
- Why would it be impractical? You’ve convinced me that it’s not Wikisource that requires special treatment – rather, any books that are available online, because such links are useful for all of them. —Tacsipacsi (talk) 10:40, 23 June 2024 (UTC)
- Because for more known works, there could be tens of websites where the work is published. These could be indeed useful, but they would, in my opinion, clutter Wikidata references. It would be convenient to create a separate item for the chapter cited, and add the URLs into that item.
- As for sources that are not cited frequently on Wikidata, you are right it would be more convenient to add the URL directly into references.
- But this creates another question: How much should Wikidata editors strive to add all available online locations into references (or create chapter items, as I suggested)? A lot of time would be saved if we retained the current way of citing books (readers need to go to the Wikidata page of the cited item, pick an online source, and search for the correct chapter on the external website). As you said, this is inconvenient. But it is more inconvenient for Wikidata to add chapter URLs to every single reference. Janhrach (talk) 07:28, 29 June 2024 (UTC)
Because for more known works, there could be tens of websites where the work is published. These could be indeed useful, but they would, in my opinion, clutter Wikidata references.
- I wouldn’t add more than one; given how citation templates on wikis work, more than one URL wouldn’t appear anyway.
A lot of time would be saved if we retained the current way of citing books (readers need to go to the Wikidata page of the cited item, pick an online source, and search for the correct chapter on the external website).
- For whom? Yes, editors would save time by deferring the work of finding the chapter to readers; readers, on the other hand, would lose time. However, I don’t want to force editors into adding the URLs – if someone takes that extra step, it’s welcome, if someone doesn’t, it’s okay. —Tacsipacsi (talk) 07:36, 1 July 2024 (UTC)
Okay, so if at most one URL is to be added to a reference, I am not opposed. Janhrach (talk) 06:22, 13 July 2024 (UTC)
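To make the outcome of this thread concrete, the following Python sketch models a chapter citation as discussed: stated in (P248) and chapter (P792) are always present, and at most one reference URL (P854) may be added optionally. The function name and the dictionary layout are hypothetical and do not reflect the actual Wikibase data format; the item, chapter number and URL are the ones used in the example above.

```python
def chapter_reference(work_qid, chapter, url=None):
    """Build a reference citing one chapter of a work that has its own item."""
    ref = {
        "P248": work_qid,  # stated in: the item for the whole work
        "P792": chapter,   # chapter: where in the work the claim is found
    }
    if url is not None:
        # reference URL: optional, and at most one per reference.
        # It cannot be derived from P248 + P792 alone, because the page
        # naming scheme (".../Chapter 11", ".../11", ...) is not predictable.
        ref["P854"] = url
    return ref


without_url = chapter_reference("Q639072", "11")
with_url = chapter_reference(
    "Q639072",
    "11",
    url="https://en.wikisource.org/wiki/The_Curse_of_Capistrano/Chapter_11",
)
```

Both variants are acceptable under the position reached here: the URL is a convenience for readers, not a requirement for editors.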
Source administrative documents
Hi,
I have a question about administrative documents. I have tables full of figures that were drafted by a Parliament. However, the documents are not directly available from the Parliament itself; instead, they were retrieved and published by an NGO.
When adding the documents as reference for the figures, what is the proper way to label them? Of course, I add the document's URL as reference URL (P854). But then, should the Parliament be publisher (P123)? And, if so, what should the NGO be? Or should the Parliament be author (P50) and the NGO publisher (P123)? Or something else entirely?
Thanks! Julius Schwarz (talk) 13:32, 24 September 2024 (UTC)
- Anyone? Julius Schwarz (talk) 07:09, 1 October 2024 (UTC)
- I can't say I have a definitive answer, but your second suggestion (parliament as author, NGO as publisher) seems reasonable. ArthurPSmith (talk) 13:54, 1 October 2024 (UTC)
- Thanks for the feedback! Julius Schwarz (talk) 14:03, 1 October 2024 (UTC)
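For readers with the same question, the suggestion above (Parliament as author, NGO as publisher) could be sketched in Python as follows. This is only an illustration of that one suggestion, not established guidance; the function name, the placeholder item IDs and the example URL are all hypothetical.

```python
# Hypothetical placeholders: substitute the real item IDs of the two bodies.
PARLIAMENT_ITEM = "Q_PARLIAMENT"
NGO_ITEM = "Q_NGO"


def administrative_reference(document_url, author_item, publisher_item):
    """Reference for a document drafted by one body and published by another."""
    return {
        "P854": document_url,    # reference URL: where the document was retrieved
        "P50": author_item,      # author: the body that drafted the document
        "P123": publisher_item,  # publisher: the body that made it available
    }


ref = administrative_reference(
    "https://example.org/budget-tables.pdf", PARLIAMENT_ITEM, NGO_ITEM
)
```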
Add recommendation to avoid redundant references
I have a suggestion. Under When to source a statement, after the list of "In some cases sources are not required:" I suggest that we add a new paragraph.
Also, it's recommended not to add a second reference unless it is based on an independent source (for instance, not just different databases based on the same primary source). An acceptable exception to this is if the new reference adds a missing quality to the source, like machine-readability or being open access.
The reason for doing this is to avoid redundancy. An extreme example is Q1237312#P569 with 15 references, but even with just two, it generally adds clutter instead of value. Possibly, we could also give guidance on how to signal these qualities, but I don't know what would be the best way to do it. Ainali (talk) 10:58, 17 October 2024 (UTC)