Metadata standards and models | IHSN (2024)

The XML Language

eXtensible Markup Language, or XML, was developed as a common tool to structure information to be shared on the Web and between software systems. XML tags text for meaning rather than appearance: it organizes the content of a text by marking it up with meaningful labels. Although "tags" are conceptually similar to the "fields" of a database, XML files, unlike database files, are regular text files that can be viewed and edited in any standard text editor. An XML file can be searched and queried like a database using tools such as XPath or XQuery, and edited using XForms. (A web-based tutorial on these tools can be found at http://www.w3schools.com/xml.) Just as the content of a database can be converted into a report, XML documents can be read and transformed by other software applications into user-friendly formats such as spreadsheets, PDF files, or Web pages.

The following example shows how textual information about a survey could be presented in XML using DDI tags:

    <titl>Multiple Indicator Cluster Survey 2005</titl>
    <altTitl>MICS</altTitl>
    <AuthEnty>National Statistics Office (NSO)</AuthEnty>
    <fundAg abbr="UNICEF">United Nations Children's Fund</fundAg>
    <collDate date="2005-01" event="start"/>
    <collDate date="2005-03" event="end"/>
    <nation>Popstan</nation>
    <geogCover>National</geogCover>
    <sampProc>5,000 households, stratified two stages</sampProc>
    <respRate>98 percent</respRate>
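Because the tags carry meaning, a program can pull individual facts out of such a file. The following is a minimal sketch in Python using the standard library's `xml.etree.ElementTree` module, which supports a subset of XPath. The `<survey>` wrapper element is an assumption added here only because an XML document needs a single root; it is not a DDI element.

```python
import xml.etree.ElementTree as ET

# The fragment from the example above, wrapped in a hypothetical <survey>
# root element (not part of DDI) so that it parses as one XML document.
DDI_FRAGMENT = """
<survey>
  <titl>Multiple Indicator Cluster Survey 2005</titl>
  <altTitl>MICS</altTitl>
  <AuthEnty>National Statistics Office (NSO)</AuthEnty>
  <fundAg abbr="UNICEF">United Nations Children's Fund</fundAg>
  <collDate date="2005-01" event="start"/>
  <collDate date="2005-03" event="end"/>
  <nation>Popstan</nation>
  <geogCover>National</geogCover>
  <sampProc>5,000 households, stratified two stages</sampProc>
  <respRate>98 percent</respRate>
</survey>
"""

root = ET.fromstring(DDI_FRAGMENT.strip())

# Element text is looked up by tag name.
title = root.findtext("titl")

# Attributes are read with .get(); the [@event='...'] predicate is part of
# the XPath subset that ElementTree understands.
funder_abbr = root.find("fundAg").get("abbr")
start = root.find("collDate[@event='start']").get("date")
end = root.find("collDate[@event='end']").get("date")

print(f"{title}: collected {start} to {end}, funded by {funder_abbr}")
```

A full XPath or XQuery engine offers far richer queries; this sketch only illustrates the principle that tagged text can be queried like a database.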

Tags are particularly powerful when a user community agrees on a common set (such as the DDI or DCMI standards). Adopting a common set of XML tags offers major advantages in documenting microdata: it creates a comprehensive "checklist" of useful metadata elements; it makes it possible to assess a file's contents by checking whether particular tags are present; it allows a dataset catalog to be built and queried on key metadata elements; and it allows the file to be transformed into more user-friendly formats. XML files can be converted into HTML, PDF, or other documents using XSL Transformations (XSLT), or exchanged across networks or the Internet using web services or SOAP. For example, applying an XSL Transformation to the XML file above can produce an HTML web page.
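Two of these advantages, checking a file against a list of expected tags and converting it into a friendlier format, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended implementation: the checklist in `REQUIRED_TAGS` and the `<survey>` wrapper are hypothetical, and the HTML is produced with plain Python instead of XSLT, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical checklist; a real one would come from the agreed standard.
REQUIRED_TAGS = ["titl", "AuthEnty", "nation", "sampProc", "abstract"]


def missing_tags(root, required):
    """Return the required tag names that occur nowhere in the tree."""
    present = {element.tag for element in root.iter()}
    return [tag for tag in required if tag not in present]


def to_html(root):
    """Render each child element that has text as one row of an HTML table."""
    rows = [
        f"<tr><th>{escape(child.tag)}</th><td>{escape(child.text)}</td></tr>"
        for child in root
        if child.text and child.text.strip()
    ]
    return "<table>\n" + "\n".join(rows) + "\n</table>"


# A shortened version of the survey example, in a hypothetical wrapper.
root = ET.fromstring(
    "<survey>"
    "<titl>Multiple Indicator Cluster Survey 2005</titl>"
    "<AuthEnty>National Statistics Office (NSO)</AuthEnty>"
    "<nation>Popstan</nation>"
    "</survey>"
)

print(missing_tags(root, REQUIRED_TAGS))  # tags the file still lacks
print(to_html(root))
```

In practice an XSLT stylesheet would play the role of `to_html`, keeping the presentation rules separate from the program that applies them.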

Data Documentation Initiative (DDI)

Traditionally, data producers wrote text-based codebooks. To take full advantage of web technology, most metadata standards are now defined in XML. The DDI is a standard dedicated to microdata documentation; it enables documentation of even the most complex microdata files in a way that is both flexible and rigorous. It provides a straightforward means of recording and communicating all the salient characteristics of microdata sets.

The DDI Alliance, hosted by the University of Michigan, maintains the DDI metadata standard. Its website provides detailed information on the DDI Codebook and DDI Lifecycle specifications, and provides a catalog of tools that make use of the DDI standard.

The DDI Alliance maintains two versions of the DDI specification: DDI Codebook (which is used and recommended by the IHSN) and DDI Lifecycle (a more complex version of the specification). The DDI Codebook is a major transformation of the once-familiar electronic “codebook”: it retains the same set of capabilities but greatly increases the scope and rigor of the information it contains.

The DDI metadata specification originated in the Inter-university Consortium for Political and Social Research (ICPSR), a membership-based organization with more than 500 member colleges and universities worldwide. It is now the project of an alliance of institutions in North America and Europe. Member institutions comprise many of the world’s largest data producers and data archives.

The DDI specification addresses the types of data resulting from surveys, censuses, administrative records, experiments, direct observation, and other systematic methodologies for generating empirical measurements. For example, the units of analysis could be individual persons, households, families, business establishments, transactions, countries, or other subjects of scientific interest. Similarly, observations may consist of measurements at a single point in time in a single setting, such as a sample of people in one country during one week. Or they may comprise repeated observations in multiple settings, including longitudinal and repeated cross-sectional data from many countries, as well as time series of aggregated data. The DDI specification also provides for full descriptions of the study’s methodology (e.g., mode of data collection, applicable sampling methods, universe, geographical areas of study, responsible organization and persons, and so on).

Structure

The DDI specification permits all aspects of a survey to be described in detail: methodology, responsibilities, files, and variables. It provides a structured and comprehensive list of hundreds of elements and attributes that may be used to document a dataset, although it is unlikely that any one study would use all of them. Some elements, however, such as “Title,” are mandatory and must be unique. Other elements are optional and can be repeated, such as “Authoring Entity/Primary Investigator,” which records the person(s) and/or organization(s) responsible for the survey. DDI Codebook (version 2.n) elements are organized in five sections:

Section 1.0: Document Description

A study (i.e., survey, census or other) is not always documented and disseminated by the same agency as that which produced the data. Therefore, it is important to provide information (i.e., metadata) not only on the study itself, but also on the documentation process. The Document Description consists of an overview—the “metadata about metadata”—describing the DDI-compliant XML document.

Section 2.0: Study Description

The Study Description is an overview of the study and includes information on how the study should be cited; who collected, compiled, and distributed the data; a summary (abstract) of the data content; details of data collection methods and processing; and so on.

Section 3.0: Data File Description

This section describes each data file’s content, record and variable counts, version, producer, and so on.

Section 4.0: Variable Description

This section presents details of each variable, including literal question text, universe, variable and value labels, derivation and imputation methods, and so on.

Section 5.0: Other Material

This section allows for descriptions of other material related to the study. These can include documents such as questionnaires, coding information, technical and analytical reports, and interviewers’ manuals; data processing and analytical programs; photos; or maps.
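The five sections correspond to the top-level elements of a DDI Codebook document: `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, and `otherMat`, all inside a `codeBook` root. The following Python sketch builds a bare skeleton to show how they nest. It is deliberately incomplete: the XML namespace and most required content are omitted, so the output would not validate against the DDI schema, and the IDs, the variable name `hhsize`, and its label are made up for illustration.

```python
import xml.etree.ElementTree as ET

code_book = ET.Element("codeBook")

# Section 1.0: Document Description (metadata about the metadata)
ET.SubElement(code_book, "docDscr")

# Section 2.0: Study Description; the title lives in citation/titlStmt/titl
study = ET.SubElement(code_book, "stdyDscr")
citation = ET.SubElement(study, "citation")
title_stmt = ET.SubElement(citation, "titlStmt")
ET.SubElement(title_stmt, "titl").text = "Multiple Indicator Cluster Survey 2005"

# Section 3.0: Data File Description (one element per data file)
ET.SubElement(code_book, "fileDscr", ID="F1")  # hypothetical ID

# Section 4.0: Variable Description (one <var> per variable)
data = ET.SubElement(code_book, "dataDscr")
var = ET.SubElement(data, "var", ID="V1", name="hhsize")  # hypothetical variable
ET.SubElement(var, "labl").text = "Household size"

# Section 5.0: Other Material (questionnaires, reports, programs, ...)
ET.SubElement(code_book, "otherMat")

ET.indent(code_book)  # pretty-print; requires Python 3.9 or later
print(ET.tostring(code_book, encoding="unicode"))
```

In practice such files are normally produced with a DDI-aware editor rather than by hand; the sketch only makes the section structure concrete.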

Dublin Core Metadata Specification (DCMI)

The following content is derived from the DCMI website (http://dublincore.org).

The DCMI Metadata Element Set (ISO standard 15836), also known as the Dublin Core metadata standard, is a simple set of elements for describing digital resources. This standard is particularly useful in describing resources related to microdata, such as questionnaires, reports, manuals, data processing scripts and programs, etc. It originated in 1995 at a workshop in Dublin, Ohio, organized by the Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA). Over the years, it has become the most widely used standard for describing digital resources on the Web, and it was approved as an ISO standard in 2003. The standard is maintained and further developed by the DCMI, an international organization dedicated to the promotion of interoperable metadata standards.

A major reason behind the success of the Dublin Core metadata standard is its simplicity. From the outset, it has been the goal of the designers to keep the element set as small and simple as possible to allow the standard to be used by non-specialists. The standard also makes it easy and inexpensive to create simple descriptive records for information resources, while providing for effective retrieval of those resources on the Web or in any similar networked environment.

In its simplest form, the Dublin Core consists of the following 15 metadata elements, all of which are optional and repeatable: Title; Relation; Rights; Subject; Coverage; Date; Description; Creator; Format; Type; Publisher; Identifier; Source; Contributor; Language.
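As an illustration of how small such a record can be, the sketch below builds a Dublin Core description of a questionnaire in Python. The namespace `http://purl.org/dc/elements/1.1/` is the one DCMI defines for the 15 elements; the `<record>` wrapper, the helper name `dublin_core_record`, and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of the simple Dublin Core set, in lowercase as used in XML.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields):
    """Build a record from (element, value) pairs.

    Every element is optional and may appear more than once, so the input
    is a list of pairs rather than a dictionary.
    """
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Illustrative values only.
record = dublin_core_record([
    ("title", "Household Questionnaire"),
    ("creator", "National Statistics Office (NSO)"),
    ("language", "en"),
    ("language", "fr"),  # repeated element
    ("type", "Text"),
])
print(ET.tostring(record, encoding="unicode"))
```

Passing a list of pairs instead of a dictionary is a small design choice that follows directly from the "repeatable" rule: a dictionary could hold only one value per element.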

ISO 11179 - Information Technology - Metadata registries (MDR)

The International Standard ISO/IEC 11179-1 was developed by the Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 32, Data management services. "ISO/IEC 11179 describes the standardizing and registering of data elements to make data understandable and shareable. Data element standardization and registration as described in ISO/IEC 11179 allow the creation of a shared data environment in much less time and with much less effort than it takes for conventional data management methodologies." (Source: ISO-IEC 1999, available at http://metadata-stds.org/11179-1/ISO-IEC_11179-1_1999_IS_E.pdf)

Statistical Data and Metadata Exchange (SDMX)

Focusing on time series and indicators, SDMX is the result of a joint effort between the Bank for International Settlements (BIS), European Central Bank (ECB), EUROSTAT, International Monetary Fund (IMF), Organization for Economic Cooperation and Development (OECD), United Nations (UN), and World Bank (WB) to create an XML specification to support the exchange of aggregate data and metadata. SDMX provides three types of statistical metadata standards: standards for data formats, standards for metadata, and a registry-based architecture to implement these standards and to exchange data between systems.

One of the requirements of SDMX was coordination with other metadata specifications such as the DDI. DDI metadata, which emphasize archival metadata and microdata rather than aggregate data, can be exchanged in an equivalent SDMX metadata format. This ensures interoperability of metadata across namespaces.

SDMX stands for Statistical Data and Metadata Exchange, the electronic exchange of statistical information. Its goal is to explore e-standards that could increase efficiency and avoid duplication of effort, both in the sponsoring organizations' own work and possibly in the work of others in the field of statistical information.

Generic Statistical Business Process Model (GSBPM)

The GSBPM describes statistical processes, such as the implementation of a survey, in nine phases, each divided into sub-processes:

  • Specify the data needs
  • Design
  • Build
  • Collect (includes data entry)
  • Process (includes data editing)
  • Analyze
  • Disseminate
  • Archive
  • Evaluate

In addition to these nine phases, the GSBPM includes two overarching components: Quality Management and Metadata Management.
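One practical use of the model is to label each activity of a survey with the phase it belongs to, so that documentation can be ordered and compared across surveys. The sketch below encodes the nine phases listed above as an ordered Python enumeration; the class name `Phase` and the sample activities are hypothetical, and sub-processes are left out.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The nine phases as listed above, numbered in order."""
    SPECIFY_NEEDS = 1
    DESIGN = 2
    BUILD = 3
    COLLECT = 4
    PROCESS = 5
    ANALYZE = 6
    DISSEMINATE = 7
    ARCHIVE = 8
    EVALUATE = 9


# The two overarching components apply to every phase.
OVERARCHING = ("Quality Management", "Metadata Management")

# Hypothetical activity log, tagged by phase and given out of order.
activities = [
    ("Data entry", Phase.COLLECT),
    ("Questionnaire design", Phase.DESIGN),
    ("Data editing", Phase.PROCESS),
]

# Sorting by phase restores the order of the production process.
ordered = sorted(activities, key=lambda item: item[1])
for name, phase in ordered:
    print(f"{phase.value}. {phase.name.replace('_', ' ').title()}: {name}")
```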

The GSBPM provides a framework to describe the statistical production process in terms of standard components (i.e., phases and sub-processes). It is intended to apply to all activities undertaken by producers of official statistics, at both the national and international levels, which result in data outputs. It is designed to be independent of the data source, so it can be used for the description and quality assessment of processes based on surveys, censuses, administrative records, and other non-statistical or mixed sources.

Generic Statistical Information Model (GSIM)

GSIM is a reference framework of internationally accepted definitions, attributes, and relationships that describe the information used in the production of official statistics and information objects. This framework enables generic descriptions of the definition, management, and use of data and metadata throughout the statistical production process.

GSIM provides a common language to describe information that supports the entire statistical production process, from the identification of user needs through the dissemination of statistical products.

GSIM is aligned with relevant data management and exchange standards, such as DDI and SDMX, but is not directly tied to them, or to any specific technology.

GSIM is not software, nor an information technology (IT) standard. It is a strategic approach and a new way of thinking, designed to bring together statisticians, methodologists, and IT specialists to modernize and streamline the production of official statistics.

The previous information was extracted from the GSIM brochure, available at the GSIM website.
