Since October 7, 2016, open data has been the rule for all bodies entrusted with a public service mission that have more than 50 employees, and for local authorities with more than 3,500 inhabitants. "While the law imposes a principle of generalized openness, each territory publishes data according to its competencies, its data assets and its practices," Data Publica observes in a very comprehensive report devoted to the standardization of open data.

This booklet gives an overview of the standardization of open data, focusing on the challenges of designing and reusing open data in local governments.

After recalling the growing importance of standardization in information technology, and highlighting the considerable standardization work that underlies the production of statistics, the authors retrace the numerous initiatives in favor of open data standardization. They look in detail at sectoral standards (mobility data, in particular), point out the challenges of designing standards (consultation, the difficulty of choosing the "right" format), and then those of producing standardized data (standardizing data that has already been published, supporting the production of new data), before concluding with a discussion of alternatives to standardization.

Open data differ from one territory to another

"From one producer to another, the files do not necessarily contain the same fields or give the same level of detail," the authors of the booklet observe. "The values in the fields themselves are not standardized. Data are not named in the same way in different territories." In addition to differences in terminology between communities, there is a more general mismatch between producers, who publish documents using their own vocabulary, and users, who express their needs in another.

"In concrete terms, these issues of discoverability and data standardization limit the impact of open data." Without harmonization of practices, it is very difficult to build services or uses that go beyond a single territory. The authors take as an example the Handimap application, which proposes accessible routes for people with reduced mobility by taking lowered curbs into account, and whose development has been hampered by the lack of standardization of data on road accessibility. Each new local instance of the application required significant development work to adapt to the local data.

Standards allow different digital tools to communicate and form a coherent whole in the service of one or more objectives. Data standardization could thus reduce friction "by facilitating the discovery of similar open data in different territories and by making it possible to consolidate locally produced data into a national database that can be easily exploited."
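The adaptation work described for Handimap can be pictured with a minimal sketch. Everything in it is invented for illustration: the two territories, their field names, the shared vocabulary and the list of values treated as "true" are assumptions, not taken from the booklet or from any real dataset. It only shows why, without a common standard, each new territory needs its own mapping before local data can be consolidated.

```python
# Hypothetical records: two territories describe lowered curbs
# with their own field names and their own value conventions.
CITY_A = [{"rue": "Rue de la Paix", "trottoir_abaisse": "oui"}]
CITY_B = [{"street_name": "Avenue Foch", "low_curb": "1"}]

# One mapping per territory, from local field names to a shared
# vocabulary (invented for this example).
FIELD_MAPS = {
    "city_a": {"rue": "street", "trottoir_abaisse": "lowered_curb"},
    "city_b": {"street_name": "street", "low_curb": "lowered_curb"},
}

# Local spellings that all mean "yes" (assumed list).
TRUE_VALUES = {"oui", "yes", "1", "true"}


def harmonize(records, territory):
    """Rename local fields and normalize values for one territory."""
    mapping = FIELD_MAPS[territory]
    harmonized = []
    for record in records:
        # Keep only the fields that have an equivalent in the shared vocabulary.
        row = {mapping[key]: value for key, value in record.items() if key in mapping}
        # Turn the local yes/no spelling into a boolean.
        raw = str(row.get("lowered_curb", "")).strip().lower()
        row["lowered_curb"] = raw in TRUE_VALUES
        row["territory"] = territory
        harmonized.append(row)
    return harmonized


# Consolidate both local datasets into a single list.
national = harmonize(CITY_A, "city_a") + harmonize(CITY_B, "city_b")
```

If both territories published directly in the shared vocabulary, the per-territory mappings would no longer be needed, which is the friction that standardization aims to remove.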
Standardize data with schemas

Open data standardization is built around schemas. These are machine-readable standards: conventions that describe the fields and values allowed in a dataset that conforms to them. It is therefore by conforming to a schema that producers create standardized datasets. Because they are machine-readable, schemas can also be reused in forms and interfaces intended for people.

Following the example of statistics, where the comparability of indicators rests on their standardization, the world of geographic and environmental information has been engaged in advanced standardization work since 2007. The INSPIRE directive laid down provisions for the interoperability of geographic and environmental data through the standardization of metadata, but also of the data itself. In France, the Conseil National de l'Information Géolocalisée (CNIG) is the main developer of geographic data standards (generally regulatory standards). In 2022, the former "data" commission was renamed the "standards" commission, attesting to the important place that the design of geostandards holds in the activities of the CNIG.

Beyond geographic data, France has recently seen a resurgence of interest and a proliferation of initiatives around standardization. The number of data schemas governing the production and reuse of data, particularly by local authorities, is increasing.

Since 2018, the OpenDataFrance association, which federates local authorities committed to an open data approach, has been developing the Socle Commun des Données Locales (SCDL) to homogenize the open data publication of essential data produced by territorial actors, to help producers improve the quality of the data they publish, and to facilitate the exploitation of published data by reusers. Eight datasets previously selected as priorities have been standardized, and the base is being extended.

The SCDL has also spurred momentum in the state administration with the June 2019 launch of schema.data.gouv.fr, which references French standards that have been adopted by regulation or designed by the community of data producers and reusers. The site also references schemas that are under investigation and construction.
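To make the idea of a machine-readable schema more concrete, here is a minimal sketch of a schema that lists the fields and values allowed in a dataset, together with a check that a CSV file conforms to it. The schema layout, the field names, the allowed values and the `validate` function are all assumptions made for this example. They do not reproduce the format of any schema actually referenced on schema.data.gouv.fr.

```python
import csv
import io

# Invented schema: the fields a dataset must contain and, where
# relevant, the values each field may take.
SCHEMA = {
    "fields": [
        {"name": "code_insee", "required": True},
        {"name": "type", "required": True,
         "allowed": ["school", "library", "town_hall"]},
        {"name": "name", "required": False},
    ]
}


def validate(csv_text, schema):
    """Return a list of human-readable errors (empty if the data conforms)."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    expected = [field["name"] for field in schema["fields"]]
    header = reader.fieldnames or []

    # Check the columns against the fields declared in the schema.
    for name in expected:
        if name not in header:
            errors.append(f"missing column: {name}")
    for name in header:
        if name not in expected:
            errors.append(f"unexpected column: {name}")

    # Check each row; data starts on line 2, after the header.
    for line_no, row in enumerate(reader, start=2):
        for field in schema["fields"]:
            name = field["name"]
            value = (row.get(name) or "").strip()
            if not value:
                if field.get("required"):
                    errors.append(f"line {line_no}: {name} is required")
                continue
            allowed = field.get("allowed")
            if allowed and value not in allowed:
                errors.append(f"line {line_no}: {name}={value!r} not allowed")
    return errors


# Hypothetical sample: the second data row breaks two rules.
SAMPLE = "code_insee,type,name\n35238,library,Les Champs Libres\n,museum,\n"
errors = validate(SAMPLE, SCHEMA)
```

Because the schema is plain data, the same description could in principle also drive an input form that only offers the allowed values, which is the kind of reuse in interfaces mentioned above.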
Schema.data.gouv.fr: produce schemas collaboratively to homogenize data

As of the summer of 2022, nearly fifty data schemas are referenced, including a dozen under investigation. The documentation provided on the site allows data producers to take ownership of these schemas and thus to produce datasets that comply with them.

Following the Prime Minister's circular of April 27, 2021, on public policy for data, algorithms and source code, 15 ministerial roadmaps were published on September 27, 2021.

The Ministry of Territorial Cohesion's roadmap states that "encouraging the opening of data according to shared reference systems is a guarantee of quality that, in the long run, will facilitate interoperability and even the emergence of open solutions. In collaboration with associations of local authorities, pioneering territories at different scales, as well as publishers of digital solutions equipping local authorities, it is a matter of converging on and promoting best practices in standardization" (Action 15).