A technical glossary for art researchers
Definitions to support historians working with digital tools
Creating a digital catalogue raisonné requires researchers to understand a wide range of digital tools, concepts, and processes. The field is full of acronyms, subtle variations, and interconnected concepts, all of which can create confusion and communication challenges. This glossary offers short definitions of common terms so that researchers can better understand and make use of the digital landscape.
The glossary is divided into three sections: web development, general data, and standardized and authority data. The terms within each section are organized alphabetically, but we recommend using the find function (Command + F on a Mac, Control + F on Windows) to search for what you need.
This is a growing resource. We’re improving definitions, adding links, and adding new terms weekly. If there’s a term you’d like to see, please let us know.
Table of contents
Standardized data
Categories for the Description of Works of Art (CDWA)
CIDOC Conceptual Reference Model (CIDOC CRM)
Encoded Archival Description (EAD)
Exhibition Object Data Exchange Model (EODEM)
The Integrated Authority File (Gemeinsame Normdatei or GND)
International Image Interoperability Framework (IIIF)
General International Standard Archival Description (ISAD(G))
Web development
Algorithm
The term algorithm refers to finite, well-defined instructions that solve specific problems or perform computations. Algorithms are foundational to computer science—all computer programs are specific implementations of algorithms. The term has entered the zeitgeist in reference to algorithms used in social media apps that suggest content based on previous engagement and in search engines that rank results.
Algorithms can help art historians identify patterns and similarities among digitized artwork, manage and search cataloged data, complete market and text analysis, and track visitor behavior. Google Arts & Culture uses machine learning algorithms to categorize and recommend artworks based on user preferences, and the Rijksmuseum employs AI algorithms for tasks ranging from cataloging to analyzing visitor behavior.
A/B Testing
A/B testing is a method for optimizing processes, improving user experience, and enhancing overall effectiveness based on empirical data. It’s commonly used in marketing, software development, and user experience design. Conducting an A/B test involves splitting users into two groups, labeled Group A and Group B. The groups receive slightly different versions of whatever product, text, or experience is being tested. Researchers then analyze the performance metrics of each group to ascertain the impact of the changes. This approach allows for the systematic evaluation of hypotheses by controlling and isolating variables.
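The split described above can be sketched in a few lines of Python. This is an illustrative sketch only: the user IDs and conversion counts are invented, and real A/B tests also need statistical significance checks before drawing conclusions. Hashing the user ID (rather than choosing randomly on each visit) keeps the same user in the same group across sessions.

```python
import hashlib

def assign_group(user_id: str) -> str:
    """Deterministically assign a user to Group A or Group B.

    A stable hash keeps each user in the same group every time
    they visit, which random assignment per visit would not.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who completed the tracked action."""
    return conversions / visitors if visitors else 0.0

# Invented metrics for the two variants:
rate_a = conversion_rate(30, 200)   # Group A saw the original page
rate_b = conversion_rate(45, 200)   # Group B saw the changed page
print(rate_a, rate_b)
```

Comparing `rate_a` and `rate_b` (here 0.15 versus 0.225) is the raw material for deciding which variant performed better.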
Application Programming Interface (API)
An Application Programming Interface, or API, bridges computer applications, allowing them to communicate data efficiently and securely. APIs not only connect applications but also define how data requests should be made and responses returned. APIs can be used to create digital catalogues raisonnés, move data in an internal database to a public website, or share data between multiple institutions or organizations.
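In practice, an API usually returns structured text (often JSON) that a program then parses. The snippet below shows that second half of the exchange; the endpoint is omitted and the response body and its field names are invented for illustration, not taken from any real museum API.

```python
import json

# A hypothetical JSON response from a museum collection API.
# Field names ("objects", "title", "artist") are invented.
response_body = """
{
  "objects": [
    {"id": 101, "title": "Water Lilies", "artist": "Claude Monet"},
    {"id": 102, "title": "The Starry Night", "artist": "Vincent van Gogh"}
  ]
}
"""

data = json.loads(response_body)
titles = [obj["title"] for obj in data["objects"]]
print(titles)
```

A real integration would fetch `response_body` over HTTP from the API's documented endpoint; the parsing step shown here stays the same.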
Artificial intelligence (AI)
Artificial intelligence, or AI, is a type of technology that enables machines and software to perform tasks that usually require human attention, such as understanding language, recognizing images, solving problems, and making decisions. To do so, AI systems use algorithms as the instructions or rules to learn from data, improve over time, and adapt to new situations. AI can be found in everyday applications, including translation tools, chat bots, and recommendation systems. Word processors and editing software may also have AI integrated into them so that they can improve their suggestions as users write.
Augmented reality (AR)
Augmented reality, or AR, integrates digital elements into views of the real world on devices such as smartphones and tablets. Users can look through their devices and see the physical world with digital additions that respond in real time. Museums have experimented with augmented reality to more deeply engage visitors for over a decade. In one example, the Art Gallery of Ontario worked with artist Alex Mayhew on ReBlink, which let visitors use their phones to see the subjects of paintings come alive. Museum Next reported that 84% of visitors felt engaged with the art and 39% looked at the images again after using the app, suggesting that AR can encourage visitors to spend more time with individual artworks.
Back-end development
Back-end development creates the part of a website or app that users don't see but that makes everything work. Back-end developers manage databases, set up servers, work with APIs, and ensure everything runs smoothly behind the scenes. The work involves writing code that handles how data is stored, processed, and sent to the user interface, or front end, which is what users see. Developing a catalogue raisonné requires back-end development, typically centered on a database.
Backlink
Backlinks refer to links that one website receives from another website. For example, when ICRA links to Navigating.art, it’s a backlink for Navigating.art and an outbound link for ICRA. The terms depend on perspective. Backlinks play a crucial role in the Internet's ecosystem by connecting different pages and boosting search engine visibility. When a website links to another site, it's essentially vouching for the content or credibility of that site. Search engines like Google use backlinks as one of their key criteria for ranking pages, with more high-quality backlinks generally indicating that a site is trustworthy and relevant.
Blockchain
A blockchain can be understood as a digital ledger system that is distributed across many computers. Whenever an entry into the ledger is written, it becomes permanently connected to the previous entry and is nearly impossible to alter. Ledger entries are called “blocks,” and blocks connected to each other are referred to as a “chain,” hence the title blockchain. What differentiates blockchains from most digital ledger systems is that everyone involved has a copy, and everyone has to agree on any new entry. For this reason, blockchains are considered “decentralized.” No single authority controls it, and everyone participating helps maintain and verify the accuracy of the records.
Blockchain technology was originally created to track transactions in cryptocurrencies like Bitcoin, but it is now being used for various other purposes, such as preserving important records and ensuring the authenticity of documents. The key idea is that it's a highly secure, transparent, and collaborative way to keep records. Several organizations have proposed using blockchain to track the provenance of digital and analog art; however, because blockchain is a new, complicated, and expensive system without established standards, most organizations still prefer traditional documentation methods in most cases.
Bug
A bug in the context of web development is an error, flaw, or simply a typo in the code that causes a program to behave unexpectedly or incorrectly. Bugs often evade detection during development because they are such small mistakes among complicated code, only to manifest during testing or after deployment. Developers meticulously debug, analyze, and rectify these issues to ensure the program runs smoothly. It’s helpful to developers when users report bugs.
Cache
A cache is your web browser’s storage area that keeps copies of frequently used data to make access faster and more efficient. If you frequently visit a certain website, the images and elements of the site might be saved in your browser’s cache. This means the site loads faster the next time you visit because the browser doesn’t need to download those elements again. If you’re working on a digital publication or presentation, your developers or IT providers will use caches as part of their development strategy to ensure web pages load as quickly as possible.
If you’re having trouble with a website or web app, clearing your cache might be the solution. Data conflicts can arise when a website is updated and the cached material no longer matches the live site.
Cascading Style Sheet (CSS)
Cascading Style Sheets, or CSS, is a stylesheet language used to define the look and feel of websites. It allows developers to specify colors, fonts, layouts, and other visual elements to make pages more attractive and user-friendly across various devices. Developers employ CSS to orchestrate the placement of elements, creating structured and appealing designs. Researchers creating a catalogue raisonné may encounter CSS when discussing the visual appearance of their online publication or individualizing a template.
Changelog
A changelog is a documented record of all the changes, updates, and improvements made to a software project, application, or system over time. On the Navigating.art platform, for example, individual records have changelogs that provide a comprehensive editing history, tracking additions as well as corrections.
Within web development, a changelog is a detailed record of all changes made to a software project over time. It documents updates, bug corrections, new features, and improvements in a chronological order. Developers refer to the changelog to track the software's evolution and understand its development history. Each entry in the changelog notes what was altered, added, or removed, along with the version number.
Comma-separated values file format (CSV)
Comma-separated values, or CSV, is a simple file format for storing data in a table, like a spreadsheet. A CSV file is made up of rows and columns. Each row represents a single record. Each piece of data in a record is separated by commas. For example, a few rows of information about famous artists could be:
Claude Monet,France,1840,1926,Impressionism
Vincent Willem van Gogh,Netherlands,1853,1890,Post-Impressionism
Columns are created by keeping the pattern of data consistent. In this example, the columns are name, location of birth, birth year, death year, and dominant style. This simple organization makes it easy for computers to quickly read, use, and exchange data as long as they know the pattern. It’s common to use CSV files in data migrations and data exports when developing and managing digitalized catalogues raisonnés and archives.
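Because the pattern is consistent, a program can read the example rows directly. The sketch below uses Python's built-in csv module; the header row and column names are added here for illustration and are not part of the original example.

```python
import csv
import io

# The example artist rows, with an illustrative header row added
# so the column names travel with the data.
raw = """name,birthplace,birth_year,death_year,style
Claude Monet,France,1840,1926,Impressionism
Vincent Willem van Gogh,Netherlands,1853,1890,Post-Impressionism
"""

# DictReader pairs each value with its column name from the header.
reader = csv.DictReader(io.StringIO(raw))
artists = list(reader)
print(artists[0]["name"], artists[0]["birth_year"])
```

In a data migration, the same pattern reads from a file on disk (`open("artists.csv")`) instead of an in-memory string.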
Content management system (CMS)
A Content Management System, or CMS, is a software platform that allows users to create, manage, and publish digital content on websites without needing advanced technical skills. It provides an intuitive user interface where users can easily add text, images, videos, and other media, organize content into pages or sections, and update the website. CMSes can be connected to a website by a back-end developer or have a built-in web presence. Popular CMS platforms, like Squarespace, WordPress, and Drupal, offer various templates and plugins to customize the look and functionality of a website, making it accessible for individuals and organizations to maintain an engaging and up-to-date online presence.
Cookies
Cookies track online activity. They are small files sent between a web server and a web browser when someone visits a website and stored in a user’s web browser. Companies can use cookies to improve site performance and keep users logged into accounts. However, the information gathered through cookies can also be sold to create targeted ads or used for other purposes, which is why cookies raise privacy concerns. Understanding and managing cookies helps you control what information you share and who gets to see it, keeping your online activities more private. For those publishing a digital catalogue raisonné online, understanding cookies and creating a cookie policy for visitors to accept or deny is imperative.
Digitization
Digitization is the process of converting physical objects and processes into a digital format that computers can understand and use. The result of digitizing physical objects is a digital file that can be stored on electronic devices and widely shared. Digitizing helps preserve items that might deteriorate over time, including artwork and archival resources. These files can also be easily searched, making it faster to find specific information.
Digitizing a process often involves the digitization of objects, but it also refers to the use of new digital tools and, often, a finished digital artifact. Creating a digital catalogue raisonné, for example, can require scholars to use several computer programs to complete and record research, and the result is a digital publication.
Domain name, domain
A domain is a unique name that identifies a website on the internet. It’s what is typed in your browser's address bar to visit a specific site, like "example.com" or “navigating.art.” It’s important to secure your desired domain for a digital catalogue raisonné as soon as possible. Domains can become expensive or be secured by others, which adds an unnecessary burden to the research team. It’s also important to remember that domains are usually paid for through a subscription service, and it’s possible to lose a domain if it isn’t renewed or paid for on time.
Drop-down menu
A drop-down menu is a clickable element on a user interface that lets users choose from a list of options. When clicked, a list expands to display all the available choices. You can then select one or several of the options, which automatically fills in the field. Drop-down menus simplify pages and forms by reducing clutter and guiding selections. They are beneficial when entering data for a digital catalogue raisonné because they prevent mistakes and ensure consistency.
Extensible Markup Language (XML)
Extensible Markup Language, or XML, is a way of structuring data so that different computer systems can easily share and understand it. It's designed to be both human-readable and machine-readable, making it widely used for data exchange between different systems, applications, and platforms. In XML, data is wrapped in tags that are keywords enclosed in angle brackets. These tags describe the data and help organize it clearly and hierarchically. XML is commonly used to exchange data across the internet, including data used in digital catalogues raisonnés. Researchers likely come across it when exporting, importing, or migrating data from one system to another.
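The tag structure described above can be seen in a small example. The record below is invented for illustration (the tag names are not from any particular standard), and Python's built-in ElementTree module shows how a program navigates the hierarchy.

```python
import xml.etree.ElementTree as ET

# An invented XML record for a painting; tag names are illustrative.
record = """
<artwork>
  <title>Impression, Sunrise</title>
  <artist>Claude Monet</artist>
  <year>1872</year>
</artwork>
"""

# Parse the text into a tree, then look up individual elements by tag.
root = ET.fromstring(record)
print(root.find("artist").text)
print(root.find("year").text)
```

Real interchange formats used in catalogues raisonnés (such as EAD, covered in the standardized data section) are XML documents with much richer, standardized tag sets, but they are read the same way.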
Feature
In software development, features are the specific capabilities or functionalities a program offers its users. Features define what the software can do, such as allowing users to create accounts, send notifications, or customize settings. Software development teams craft these features to meet specific user needs and improve the overall experience. As the software evolves, new features are often introduced to expand its abilities and appeal. Software designed to support the creation of digital catalogues raisonnés often has similar features, such as linking entries and reverse image search. However, each piece of software also has a set of unique features. Knowing what features a team needs to be successful will help them decide what software to purchase.
File formats
File formats help computer applications know how to correctly process and display the information they contain. Each file format contains specific types of information, such as text, images, audio, or video.
Common file formats include:
Image Formats: .jpg (JPEG image), .png (Portable Network Graphics)
Audio Formats: .mp3 (MPEG Audio Layer III), .wav (Waveform Audio File)
Video Formats: .mp4 (MPEG-4), .avi (Audio Video Interleave)
Text Formats: .txt (plain text), .docx (Microsoft Word document)
Creating a digital catalogue raisonné regularly requires researchers to interact with files of all kinds, especially text and image files.
Front-end development
Front-end development focuses on creating the parts of a website or app that users see and interact with, including the user interface. It involves writing code to design the layout, style, and behavior of the interface, as well as adding text and other forms of content. Front-end developers enhance the overall user experience by refining visual elements and interactive features with the help of designers. Creating a digital catalogue raisonné with a custom website requires a front-end developer to program it.
General Data Protection Regulation (GDPR)
General Data Protection Regulation, or GDPR, is a law in the European Union that protects people's personal information. It defines how companies and organizations collect, store, and use personal data. The GDPR applies to any organization, regardless of its location, if it deals with the personal data of people in the EU. This means that even if a website is based in the United States, it must adhere to GDPR regulations when handling data from EU users.
The goal of GDPR is to give individuals more control over their personal information, ensuring it is handled safely and with their consent. It also gives people the right to know what data is being collected about them, to correct it if it's wrong, and to delete it if they no longer want the organization to have it. Organizations that do not follow these rules can face significant fines. If your organization has a newsletter list or a contact form, you’ll need to follow GDPR.
Google Analytics
Google Analytics is a web analytics tool that allows website owners to track and analyze their site's traffic and user behavior. It collects data on how visitors interact with a website, including the number of visitors, page views, bounce rates, and conversion rates. Through detailed reports and dashboards, Google Analytics helps users understand which pages are most popular, how visitors found the site, and what actions they take while browsing. This information enables website owners to make informed decisions to improve their site's performance, optimize content, and enhance user experience.
Google Search Console
Google Search Console is a free tool provided by Google that helps website owners and administrators monitor and improve their site's performance in Google search results. It provides valuable insights into how Google's web crawler views a website, including the number of clicks, impressions, and the overall search ranking. Additionally, Search Console offers tools to submit sitemaps, optimize content for better visibility, and analyze which pages are performing well or need improvement. Site owners can use these insights to make data-driven decisions to enhance their site's visibility in search results.
Hypertext Markup Language (HTML)
Hypertext Markup Language, or HTML, is primarily used in front-end development to create web pages. It defines the structure and content of a webpage, organizing text, images, and links into a readable format. Along with CSS, HTML also helps determine how that structure and content look. Like all markup languages, HTML has limitations that need to be taken into consideration when preparing the digital publication of a catalogue raisonné.
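HTML's nested, tag-based structure can be seen by letting a program walk through a page fragment. The fragment below is invented for illustration; Python's built-in html.parser reports each tag as it opens, showing the order and nesting a browser would see.

```python
from html.parser import HTMLParser

# A minimal invented page fragment: a heading and a paragraph
# containing an emphasized artist name.
page = "<h1>Catalogue</h1><p>Works by <em>Claude Monet</em>.</p>"

class TagCollector(HTMLParser):
    """Record the name of every tag in the order it opens."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

collector = TagCollector()
collector.feed(page)
print(collector.tags)
```

Running this lists `h1`, then `p`, then `em`: the structure HTML defines, which CSS then styles.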
Hypertext Transfer Protocol (HTTP)
Hypertext Transfer Protocol, or HTTP, is a protocol used for transmitting data over the web. In a web address, such as http://navigating.art, the "http://" part indicates the transmitting protocol being used, and "navigating.art" is the domain name. The domain tells your browser where to go, and the protocol tells your browser how to communicate with the server hosting the domain. Other protocols exist but are rarely visible to regular browser users.
Internet protocol address (IP address)
An Internet Protocol address, or IP address, is a unique identifier assigned to each device connected to a network. A device is anything connected to that network, such as every computer, smartphone, and tablet connected to the internet. IP addresses enable devices to locate and communicate with each other.
Input field
An input field is an area within a user interface or a database where data is entered or stored. Fields are designed to capture and hold different types of information, such as numbers, text, or dates, within a defined structure. Developers can define fields to ensure the data collected is accurate and relevant. By validating and organizing data in fields, developers improve the functionality and reliability of software systems. All computer programs designed for catalogue raisonné development have fields dedicated to specific information, such as names, dates, and institutions.
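The validation mentioned above is often a small function attached to each field. The sketch below is illustrative only: the field name and the accepted year range are assumptions, not rules from any particular catalogue raisonné software.

```python
def validate_birth_year(value: str) -> int:
    """Validate a hypothetical birth-year input field.

    The accepted range (1000-2100) is an invented rule
    for illustration, not a standard.
    """
    if not value.isdigit():
        raise ValueError("Birth year must be a number.")
    year = int(value)
    if not 1000 <= year <= 2100:
        raise ValueError("Birth year out of range.")
    return year

print(validate_birth_year("1840"))
```

Entering "1840" passes and is stored as a number; entering "18forty" or "184" would be rejected before it could corrupt the data.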
Language model
A language model is a type of artificial intelligence designed to understand and generate human language. It analyzes text data to learn patterns and relationships between words to predict what word or phrase comes next in a sentence. Developers train language models using vast amounts of text data, enabling them to assist with tasks like translation, summarization, and text generation. By synthesizing language patterns, the model can simulate human-like responses and interactions. Researchers preparing a digital catalogue raisonné likely encounter language models in translation tools and word processors.
Machine learning (ML)
Machine learning, or ML, is a way for computers to learn from data and improve their performance on tasks over time without being explicitly programmed for each specific task. Instead of telling the computer exactly what to do, programmers give it examples and data, and the computer finds patterns or rules on its own. For example, programming a computer to recognize paintings by Vincent van Gogh involves showing it hundreds of paintings by Van Gogh and other painters. The computer then learns what features make a Van Gogh painting different from others, allowing it to identify new Van Gogh paintings. Painting identification is difficult for ML to accomplish, and it isn’t always correct; it works far better with simple tasks like identifying photos of cats. Instead, ML is often used in museums to predict visitor patterns, develop social media strategy, and handle other administrative tasks.
Natural language processing (NLP)
Natural Language Processing, or NLP, combines artificial intelligence, machine learning, and linguistics to make human text and speech legible to machines. Digital translators, for example, use NLP to convert the meaning of sentences and paragraphs written in one language into another. Unlike traditional translation tools, those using NLP know how to translate the meaning of idioms and common but grammatically incorrect phrases. AI tools with built-in NLP can decrease the need for external translators and make original sources available to more people.
Open access
Open access is a movement that strives to make digital information available online, free of charge and unrestricted by licensing or copyright. Open-access museum collections, for example, bring collections to the public and demonstrate their commitment to access to cultural heritage. Museums such as the Metropolitan Museum of Art, the Rijksmuseum, the Art Institute of Chicago, the J. Paul Getty Museum, and the National Gallery of Art (US) have robust open-access programs that allow people to use hundreds of thousands of images copyright free.
Open source
Open source refers to software whose code is freely available for anyone to view, modify, and distribute. This approach encourages collaboration and transparency, allowing developers to improve and adapt the software. Users can inspect the code, propose changes, and contribute to its development. Open-source projects often rely on community support to correct bugs and add new features. Some museums create open-source projects as part of their mission to support outreach, education, and innovation. The Art Institute of Chicago is a leader in this space.
Optical character recognition (OCR)
Optical Character Recognition, or OCR, is a technology that reads written or printed text in documents, books, or images and converts it into digital text that a computer can understand and work with. It even lets you take a picture of legible handwritten text and turn it into editable and searchable text on your computer. This is useful for tasks like digitizing printed documents, making scanned text searchable, and extracting information from images.
Pattern recognition analytics
Pattern recognition analytics refers to computer programs that identify and analyze recurring trends or conventions within data. This process uses algorithms to detect patterns in big data that might not be immediately apparent. Pattern recognition analytics helps in diverse fields by discerning underlying trends and anomalies. It can also be applied to large digitized archives and databases containing art historical information.
Pixel
A pixel is the smallest unit of an image displayed on a screen. It is a tiny dot of color that, together with many others, forms a complete picture. Pixels combine to render images on monitors, smartphones, and televisions. Each pixel's color and brightness can vary, which affects the overall image quality. In digital imaging, the resolution is often described in terms of pixel count, such as 1920 x 1080. Pixels can also contribute to image distortion if their arrangement is flawed. By analyzing pixels, developers and designers can enhance visual clarity and detail.
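The relationship between resolution and pixel count mentioned above is simple arithmetic, sketched here for the 1920 x 1080 example:

```python
# A 1920 x 1080 display described in terms of its pixel count.
width, height = 1920, 1080

total_pixels = width * height          # every pixel in the grid
megapixels = total_pixels / 1_000_000  # the unit used for camera sensors

print(total_pixels)            # 2,073,600 pixels
print(round(megapixels, 2))    # about 2.07 megapixels
```

The same arithmetic applies to digitized artwork: a scan's width times its height gives its total pixel count, a rough proxy for how much detail the file can hold.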
Prototype
In software development, a prototype is an early version of a feature or a program created to test and refine ideas. It demonstrates core functionalities before full-scale development begins. Developers construct prototypes to validate concepts, gather user feedback, and identify potential issues. This iterative process allows teams to adjust and improve the design based on real-world testing.
Quality assurance
Quality assurance ensures a program meets specified standards and performs reliably. It involves systematic testing to identify defects or inconsistencies throughout the development process and directly before releasing the software. QA specialists scrutinize functionality, performance, and security to rectify problems and enhance the user experience, ultimately ensuring the final product functions as intended. It’s wise to complete a quality assurance test on a published digital catalogue raisonné before announcing its publication.
Release
Release refers to the process of updating software and making the new version available to users. Releases often include bug fixes, new features, and changes to the user interface, and they usually include documentation of the changes made to the program. Developers ensure users receive the latest features and updates by carefully testing and orchestrating the release. Properly managing releases is crucial for maintaining software quality and user satisfaction. Nearly all software receives updates through releases on a regular or semi-regular basis.
Roadmap
A roadmap is a strategic plan outlining software development goals and timelines. Creating a roadmap involves key people from across the organization aligning the needs of users, developers, and the market to decide on features, milestones, and deadlines. Roadmaps ensure alignment among team members and provide a framework for adjusting plans based on feedback or changes in requirements. Roadmaps can also be adopted for managing large projects, such as digital catalogues raisonnés.
Search engine optimization (SEO)
Search Engine Optimization, or SEO, is a set of practices to improve a website's visibility in search engines like Google. It's about making websites more attractive to search engines so that pages appear higher in the search results when people search for related topics. This involves using relevant keywords, creating high-quality content, and ensuring the site is easy to navigate. Doing so helps attract more visitors, potentially leading to more readers or engagement, as higher visibility in search results generally means more people are likely to find and visit your site.
Server
A server is a computer that stores, processes, and manages information, responding to requests from users or other computers to deliver specific content or functions. Servers are essential for running websites, hosting applications, storing data, and managing communications within and between different systems. Within digital art history, servers are usually discussed as the location of long-term storage for data, including text and image files. People developing digital catalogues raisonnés create an enormous amount of data, which is stored on servers via a system set up by a programmer.
Search filters
A search filter is a tool that helps you narrow search results to find what you are looking for more quickly and accurately. It works by allowing you to set specific criteria or conditions that the results must meet. For instance, you can specify a particular date range, category, or keyword, and the filter will exclude any results that don't match these conditions, showing you only the most relevant items. These filters are especially helpful for users of digital catalogue raisonnés searching for a specific subsection of an artist’s oeuvre.
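The "criteria that results must meet" can be sketched as a small filter function. The records, field names, and criteria below are invented for illustration; real catalogue software applies the same logic at database scale.

```python
# Invented records standing in for entries in a catalogue raisonné.
works = [
    {"title": "Haystacks", "year": 1890, "medium": "oil"},
    {"title": "Water Lilies", "year": 1906, "medium": "oil"},
    {"title": "Study", "year": 1885, "medium": "drawing"},
]

def filter_works(records, start_year, end_year, medium=None):
    """Keep only records inside the date range and, optionally,
    matching the requested medium."""
    results = []
    for record in records:
        if not (start_year <= record["year"] <= end_year):
            continue
        if medium is not None and record["medium"] != medium:
            continue
        results.append(record)
    return results

matches = filter_works(works, 1885, 1900, medium="oil")
print([w["title"] for w in matches])
```

With the date range 1885-1900 and the medium "oil", only "Haystacks" survives: "Water Lilies" fails the date criterion and "Study" fails the medium criterion.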
Single Sign-On (SSO)
Single Sign-On, or SSO, is an authentication process that allows users to access multiple applications with one set of login credentials. It simplifies credential management, streamlines the user experience, and minimizes password fatigue. By leveraging centralized authentication, organizations can enhance security and manage user access more efficiently.
Software
A piece of software is a program that is installed and runs on a computer. Software allows users to interact with devices and perform various functions, from simple calculations to complex data processing. Standard software includes games, social media apps, word processors, web browsers, and the operating system that manages all the hardware and software on your device. Software is often used as an umbrella term for computer applications, including web applications, even though that isn’t technically accurate. If you’re creating a digital catalogue raisonné, you likely use specific software to do so.
Software as a Service (SaaS)
Software as a Service, or SaaS, is a software delivery model that allows users to skip the installation process and use the program online via their web browser. Users subscribe to the service rather than purchase and use it locally. The provider manages the infrastructure, security, and updates, ensuring the software is always functional with regular maintenance and expanding with releases. Popular SaaS products include Zoom, Dropbox, Google Workspace, and Microsoft 365.
Software stack
A software stack refers to a collection of software that works together for a single purpose. Each component has a specific role, and the stack’s components can be customized or replaced depending on the project’s requirements. While some all-in-one platforms for creating digital catalogues raisonnés, such as Navigating.art, exist, most software solutions involve a software stack that consists of a database, a user interface, and a publishing tool.
Standardization
Standardization in data refers to data that is collected, formatted, and stored consistently and uniformly. This involves using the same methods, units of measurement, labels, and structures for data across different systems or organizations. The goal is to make it easier to combine, compare, and analyze data accurately, ensuring that everyone understands it the same way and can use it effectively. Established standards exist in and across research fields, including archival standards. These standards are often described in handbooks for manual implementation or available for digital implementation. When a program offers single input fields for each piece of information to be added, such as the birth year of an artist or museum location, that data can be consistently presented, queried, and used in data visualization projects.
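A common standardization task is normalizing the same fact recorded in different ways. The sketch below is illustrative: the input formats are invented examples of inconsistent date entry, and the target is the ISO 8601 style (YYYY-MM-DD) widely used for standardized dates.

```python
from datetime import datetime

# Hypothetical: the same date recorded three different ways
# by different cataloguers.
raw_dates = ["12/11/1840", "1840-11-12", "12 November 1840"]

# The entry formats this sketch knows how to recognize.
known_formats = ["%d/%m/%Y", "%Y-%m-%d", "%d %B %Y"]

def to_iso(value: str) -> str:
    """Try each known format and return the date as YYYY-MM-DD."""
    for fmt in known_formats:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value}")

print([to_iso(d) for d in raw_dates])
```

All three entries normalize to "1840-11-12", after which they can be sorted, compared, and queried as one consistent field.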
Tag
A tag is a keyword or label assigned to a piece of content to categorize and identify it. Tags help organize and retrieve information efficiently. This system allows for easy searching and filtering of content based on the assigned tags. Developers may use tags to enhance search functionalities and improve user navigation. By tagging content appropriately, it becomes more accessible and manageable, facilitating better data organization and retrieval.
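In code, a tag system can be as simple as a set of keywords attached to each record. The sketch below shows filtering by tag in Python; the artwork records and tag names are hypothetical examples, not a real catalogue schema.

```python
# Hypothetical records, each carrying a set of tags.
artworks = [
    {"title": "Water Lilies", "tags": {"impressionism", "landscape"}},
    {"title": "Haystacks", "tags": {"impressionism", "rural"}},
    {"title": "Guernica", "tags": {"cubism", "war"}},
]

def filter_by_tag(records, tag):
    """Return every record whose tag set contains the given tag."""
    return [r for r in records if tag in r["tags"]]

impressionist = filter_by_tag(artworks, "impressionism")
# impressionist contains the "Water Lilies" and "Haystacks" records
```

The same pattern underlies the tag filters in most cataloguing interfaces: the user interface simply runs a lookup like this behind the scenes.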
User experience (UX)
User experience, or UX, refers to how a person feels when interacting with a user interface. It encompasses every aspect of the user interaction, including ease of use, satisfaction, and efficiency. Effective UX design enhances usability and ensures the product meets user needs. Professionals can improve functionality and aesthetics by evaluating how users engage with the system to increase user satisfaction and retention.
User interface (UI)
A user interface, or UI, is what you normally interact with when you use a computer program. It’s the part of the program that covers the underlying technical infrastructure so that you don’t need to be a developer to use it. Rather than write code, users do things such as click buttons, choose from menus, and type in input fields in the UI. Most companies aim to create user interfaces that are easy to use without instructions. Nearly all programs designed to build digital catalogues raisonnés have user interfaces.
User testing and usability testing
User testing and usability testing are methods for evaluating a product’s design and functionality. User testing involves observing real users as they interact with a product to identify issues and gather feedback. Usability testing specifically focuses on how easily and efficiently users can achieve their goals with the product. These tests scrutinize various aspects, such as navigation, comprehension, and overall user experience.
User permissions
User permissions determine which actions individual users can take in a program. Permissions govern access to features, files, or data based on the user’s role. By delineating roles and responsibilities, permissions prevent unauthorized access, protect sensitive information, and deter accidental data deletion. For teams creating a digital catalogue raisonné, setting and managing user permissions is especially important in light of the private information they contain, including ownership, value, and insurance documentation.
Versioning
Versioning is the process of keeping track of changes made to software, documents, or files by assigning a unique number or identifier to each set of changes. This helps manage updates, improvements, and fixes over time, allowing you to see the history of changes, revert to previous versions if needed, and ensure that everyone uses the most current and accurate version. It’s a good idea to look for catalogue raisonné software or web applications that include versioning for internal working processes and for alerting readers about updates to the publication.
Web 2.0
Web 2.0 refers to the Internet between 2004 and the early 2010s, when interactivity and user-generated content increased with sites such as Facebook, YouTube, and Wikipedia. Web 2.0 technologies support rich user experiences, allowing users to participate actively by creating and sharing content through platforms like social media and blogs.
Web 3.0
Web 3.0 represents the third evolution of the internet, emphasizing user control and artificial intelligence. It seeks to enhance user interactions by using technologies like blockchain, which allows users to manage their own data rather than leaving it in large corporate databases. Supporters of this approach to the Internet say it fosters greater privacy and security.
Web crawler
A web crawler, also known as a spider or bot, is a program designed to browse the internet systematically. It traverses web pages by following links from one page to another. The crawler indexes the content of each page it visits, collecting data and updating search engine databases. It can also identify and catalog new pages and detect broken links. The crawler facilitates search engine functionality by collecting and parsing vast amounts of web content. This process allows users to retrieve relevant and updated information through search queries. Web crawlers help published digital catalogues raisonnés to appear in search engine results.
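The traversal logic can be sketched without any networking. The toy example below walks an in-memory map of pages to the links they contain; the page names are invented, and a real crawler would fetch pages over HTTP.

```python
# A hypothetical "site": each page maps to the links found on it.
site = {
    "/index": ["/artists", "/works"],
    "/artists": ["/works"],
    "/works": ["/missing"],
}

def crawl(start, pages):
    """Visit every reachable page once; report links with no target."""
    visited, broken, queue = set(), [], [start]
    while queue:
        page = queue.pop()
        if page in visited:
            continue
        if page not in pages:
            broken.append(page)  # a link that points nowhere
            continue
        visited.add(page)
        queue.extend(pages[page])
    return visited, broken

visited, broken = crawl("/index", site)
# visited covers /index, /artists, /works; /missing is flagged as broken
```

Real crawlers add politeness rules (robots.txt, rate limits) and store what they find in a search index, but the follow-links-and-record loop is the same.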
Web application (web app)
A web application, or web app, is a program used through a web browser on the internet. Unlike traditional software installed on a computer, a web application runs on a server and is accessed by visiting a website. Examples of web apps include online tools for editing documents, social media platforms, and email services. Because you don't need to download or install anything, you can use web apps from any device with internet access. Most systems for developing digital catalogues raisonnés include at least one web application.
Web browser
A browser is a software application used to access and view websites on the Internet. Popular browsers include Google Chrome, Mozilla Firefox, Safari, and Microsoft Edge. If a web application, such as a library catalogue or cataloguing program, isn’t working correctly, a good first step is to restart the browser or switch to a different one and try again. The browser is often the problem rather than the web app, and it’s far easier to fix.
Workflow
A workflow is a sequence of steps or tasks required to complete a process or project. It outlines how work progresses from one stage to the next, detailing each task's dependencies and order. Teams employ workflows to ensure tasks are executed systematically and efficiently. By mapping out each step, workflows help in managing resources and deadlines effectively. They also facilitate coordination among team members by clarifying responsibilities. Properly designing and optimizing workflows can enhance productivity and reduce errors. Adjustments to the workflow may be made based on feedback or changes in project scope.
General data
Batch processing
Batch processing is a way of handling large amounts of data by collecting it over a period of time and then processing it all at once. Instead of dealing with data as it comes in, batch processing waits until there's a large set of data, and then processes it in a single, often automated, session. This method is useful for tasks like updating databases, generating reports, or analyzing data, where it’s more efficient to handle everything together rather than one piece at a time. Data migration often includes batch processing, as opposed to the slow and steady data processing that happens in the daily work of creating a digital catalogue raisonné.
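The idea can be sketched in a few lines of Python; the record names and batch size below are arbitrary examples.

```python
def in_batches(records, batch_size):
    """Yield successive batches of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start : start + batch_size]

# Ten accumulated records, processed four at a time.
incoming = [f"record-{i}" for i in range(10)]
batches = list(in_batches(incoming, 4))
# batch sizes: 4, 4, 2
```

Each batch would then be handed to whatever process does the real work, such as an update script or a report generator, instead of running that process ten separate times.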
Big data
Big data refers to large and complex datasets that are usually derived from multiple sources. Because of big data’s volume and variety, researchers use specialized tools to analyze it, gaining valuable insights, identifying patterns, and making more informed decisions. While the term big data and the tools for working with it were developed at tech companies, including Google and Amazon, art historians have found applications over the past twenty years. As the field has digitized resources, artifacts, and publications, art historians can treat large bodies of primary sources as big data and mine them for patterns, trends, and connections. One of the best-known big data projects is Project Cornelia, a study of 17th-century Flemish tapestry and painting that uses and develops datasets, data retrieval tools, and data visualization tools. Creating digital catalogues raisonnés expands available datasets and allows researchers to ask new questions about known resources.
Data
Data is any information that can be collected, stored, and analyzed. Common examples of data include numbers, text, images, sounds, and videos. Digitized art history research, artwork, and primary resources can all be treated as individual pieces of data, stored in databases, presented in data visualizations, and exchanged among digital applications. Using data tools to conduct art historical research offers unconventional strategies for creating and sharing new knowledge.
Data accessibility
Accessible is the “A” in F.A.I.R. data principles and refers to ensuring that data can be easily retrieved and used by anyone who needs it. Identifiers, licenses, protocols, and persistent metadata are integral for data accessibility. Unique identifiers ensure data can be accurately located, referenced, and cited, which is important for consistency and traceability. Licenses clarify the terms of use, enabling legal and ethical sharing and reuse of data. Protocols offer standardized methods for accessing and using data, promoting interoperability across different systems. Persistent metadata preserves essential contextual information about the data, ensuring it remains understandable and usable over time. Together, these elements enhance the accessibility, usability, and longevity of data, supporting effective data management and sharing.
Adhering to these practices makes data more available to a broader audience, fostering greater collaboration and reuse. When preparing a digital catalogue raisonné, it’s essential to plan for data accessibility and all of the F.A.I.R. principles from the beginning of the project, as they impact the tools, processes, and publication decisions.
Data architecture
Data architecture is a plan or framework for collecting, storing, managing, and using data in a computer system. It involves organizing data to make it easy to access, understand, and use. This includes deciding where data will be kept, how different pieces of data will relate to each other, and how data will flow through the system. Good data architecture ensures that data is reliable, secure, and accessible to those needing it. A well-designed data architecture in a digital catalogue raisonné is critical. It ensures computers, researchers, and readers can easily find what they’re looking for among the mass of data.
Data center
A data center is a facility that houses large numbers of computer servers and other critical infrastructure used to store, manage, and process data. These centers are designed to keep data accessible, secure, and backed up at all times. Data centers are crucial for running online services, cloud computing, and handling the massive data needs of modern businesses, governments, and other organizations. Because insurance information, purchase prices, and provenance history can be private, it’s important to ensure that the data centers used for digital catalogue raisonné data operate under strict security protocols and redundancy measures.
Data cleaning
Data cleaning refers to the process of identifying and correcting errors, inconsistencies, and inaccuracies within a dataset. It often requires humans or computer programs to remove duplicate records, add missing values, correct data entry errors, and standardize formats. This is a crucial process: without data cleaning, analyses can falter, results can mislead, and false conclusions can follow. Cleaning data for digital catalogues raisonnés often includes correcting the spelling of names, merging duplicate entries, and correctly formatting the dataset.
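A minimal Python sketch of two of these steps, deduplication and name standardization, using a hypothetical table of spelling corrections:

```python
# Hypothetical corrections mapping variant spellings to a canonical name.
corrections = {"Monet, C.": "Claude Monet", "C. Monet": "Claude Monet"}

def clean_names(names):
    """Trim whitespace, standardize spellings, and drop duplicates."""
    canonical = [corrections.get(n.strip(), n.strip()) for n in names]
    seen, result = set(), []
    for name in canonical:
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result

raw = ["Claude Monet", "Monet, C. ", "C. Monet", "Berthe Morisot"]
cleaned = clean_names(raw)  # ["Claude Monet", "Berthe Morisot"]
```

Real cleaning tools such as OpenRefine automate this kind of matching at scale, but the underlying logic is the same: normalize, map to a canonical form, deduplicate.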
Data enrichment
Data enrichment is the process of adding more information to existing data to make it more useful and valuable. This means taking your data, like an image file, and enhancing it by adding metadata such as the artist, style, and content. It’s possible to enrich data manually, but it's far more effective to do so with digital tools. Data enrichment helps to get a clearer, more complete understanding of the data, making it easier to find, analyze, and use.
Data export
Data export refers to extracting data from a system and saving it in a specific format, which is usually done for sharing, analysis, or backup purposes. Although the term is similar to data migration, data export focuses on extracting and formatting data without necessarily involving its subsequent integration into a new system. It’s common to export data as a report while preparing a digital catalogue raisonné or preparing an analogue version of the completed publication.
Data format
Data format refers to the structure of data for storage, processing, or transmission. Various formats exist, such as CSV and XML. Each format serves different purposes and ensures compatibility with specific systems. Choosing the right format is vital for efficient data handling, as data must be parsed and interpreted correctly to maintain its usability. A standardized format can simplify integration and facilitate better data management across platforms.
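As an illustration, Python’s standard csv module can write and parse the CSV format; the field names below are invented for the example.

```python
import csv
import io

# One hypothetical catalogue record.
rows = [{"title": "Haystacks", "year": "1891", "medium": "oil on canvas"}]

# Write the record out in CSV format (to an in-memory buffer here).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "year", "medium"])
writer.writeheader()
writer.writerows(rows)

# Parse the CSV text back into structured records.
csv_text = buffer.getvalue()
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Because both sides agree on the format, the data survives the round trip intact; mismatched formats are a common cause of garbled imports.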
Data governance
Data governance is the set of rules and processes that ensure data is managed properly in an organization. It involves ensuring data is accurate, secure, and used responsibly. This means deciding who can access the data, how it should be handled, and what to do if something goes wrong. Good data governance helps keep data organized, protects it from misuse, and ensures it’s reliable for making decisions. Governance is especially important when handling sensitive data, such as insurance information, names of private collections, and purchase prices.
Data harvesting
Data harvesting is the process of systematically collecting large amounts of data from various sources. This process often involves extracting information from websites or databases with tools and algorithms that efficiently gather and compile this data. However, not all data is available for public use, and it is essential to ensure that data harvesting practices comply with legal and ethical standards. Due to the sensitive nature of, and the copyright laws governing, many of the resources used for digital catalogues raisonnés, researchers don’t regularly use the data harvesting process.
Data import
Data import refers to the process of bringing data from an external source into a specific system, application, or database. This process typically involves transferring data from one format or structure to another compatible with the target system. The terms data import and data migration are related but have distinct meanings; data import is a focused, often simple task of bringing data into a system, while data migration is a comprehensive process that involves moving and transforming data from one system to another, ensuring that it remains usable and consistent in the new environment. When preparing a digital catalogue raisonné project, one of the first steps is often to import data from various sources into a single system.
Data lake
A data lake refers to a collection of raw data ingested from diverse sources, such as databases and text files. The data is retained without pre-processing, allowing for later cleaning and analysis. Over time, data will be cataloged and refined, but the initial focus is on capturing and preserving it in its original state. Best practices for developing a digital catalogue raisonné discourage researchers from keeping data in a data lake; instead, it’s best to clean and organize data as it’s collected.
Data migration
Data migration is the process of transforming, cleaning, reformatting, and importing data from one system, application, or database to another. This could mean transferring data from an old computer to a new one, moving information from one software application to another, or shifting data to a different storage system. A successful migration ensures that all data is accurately, securely, and completely transferred so that it can be optimally used in the new location. Teams preparing a digital catalogue raisonné may encounter a data migration if they change their software stack.
Data pipeline
A data pipeline refers to a series of automated processes that transform data for a specific goal. Data pipelines typically include cleaning, enriching, and grouping data before loading it into a database or delivering it to the user. Because data pipelines automatically enact a series of predetermined processes, they decrease the time and effort people need to prepare data for use.
One of the most common uses of a data pipeline in the preparation of a digital catalogue raisonné is Optical Character Recognition (OCR). A scanned document put through the OCR pipeline is automatically cleaned, analyzed by algorithms for text regions, converted into machine-readable text, and delivered as output. Because the pipeline orchestrates these steps, data is correctly processed and integrated before researchers arrive to search and analyze it.
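The orchestration itself can be sketched simply: each stage is a function, and the pipeline applies them in order. The stages below stand in for real steps such as OCR cleanup and are illustrative only.

```python
# Three small, single-purpose stages.
def strip_whitespace(text):
    return text.strip()

def normalize_quotes(text):
    return text.replace("\u201c", '"').replace("\u201d", '"')

def collapse_spaces(text):
    return " ".join(text.split())

def run_pipeline(data, stages):
    """Pass data through each stage in sequence."""
    for stage in stages:
        data = stage(data)
    return data

raw = "  \u201cEtretat\u201d   catalogue   entry  "
result = run_pipeline(raw, [strip_whitespace, normalize_quotes, collapse_spaces])
# result == '"Etretat" catalogue entry'
```

Because each stage does one thing, stages can be reordered, swapped, or reused in other pipelines without rewriting the whole process.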
Data visualization
Data visualization is the practice of representing data through graphical formats such as charts, graphs, and maps. These presentations illustrate patterns, trends, and relationships to help distill and synthesize information, making it more accessible and actionable. Effective visualizations can also elucidate anomalies and foster data-driven insights.
The Georgia O’Keeffe Museum has interesting data visualizations about the artist’s oeuvre online, which allow readers to explore the works she completed concurrently or chart developments across her life.
Database
A database is a system for storing and organizing information to be easily accessed, managed, and updated. Think of it as a digital filing cabinet where you keep data organized. Databases allow you to quickly find and work with specific pieces of information, making it much easier to handle large amounts of data efficiently. All digital catalogues raisonnés use a database to contain and arrange all of the text and images in the publication.
Datafication
Datafication is the process of converting human activity and behavior into data via quantification and digitization. By capturing activities such as social interactions, transactions, and physical activities as data points, humans and computer programs can examine previously obscure patterns and trends and derive new insights.
Creating a structured provenance record for a painting in a digital catalogue raisonné exemplifies datafication.
A provenance list for Claude Monet’s Étretat, l'Aiguille et la porte d'Aval, for example, represents decades of relationships among artists, buyers, collectors, and institutions. This complex history can be distilled into structured data that allows readers to search for individual pieces of information or conduct big data analysis.
Dummy data
Dummy data is fabricated information used for developing and testing computer programs, websites, and other digital publications. The dummy data is created to simulate the real data that will be eventually used, but it does not contain actual values or sensitive information. This helps assess how software handles various scenarios without compromising real user data, and teams can preemptively identify and rectify potential issues before going live. It’s common to use dummy data when developing the website for a digital catalogue raisonné before the final text is ready for release.
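Generating dummy data can be as simple as the Python sketch below; all names, fields, and value ranges are fabricated placeholders.

```python
import random

def make_dummy_records(n, seed=0):
    """Fabricate n plausible-looking catalogue records for testing."""
    random.seed(seed)  # fixed seed so test runs are repeatable
    titles = ["Untitled", "Study", "Composition"]
    return [
        {
            "id": i,
            "title": f"{random.choice(titles)} #{i}",
            "year": random.randint(1850, 1950),
        }
        for i in range(n)
    ]

sample = make_dummy_records(5)
```

Because none of the values are real, the records can safely be loaded into a test website or shared with developers without exposing any sensitive catalogue data.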
F.A.I.R. principles
The F.A.I.R. data principles are guidelines for managing and sharing data to make it easy to find, access, and use. Following these principles makes data more valuable because others can quickly discover, access, understand, and use it. The acronym stands for “Findable, Accessible, Interoperable, and Reusable.”
Findable: Data should be easy for people and computers to find. This means having a clear and unique identifier for each data set (like a name or number) and including detailed descriptions so others can understand what the data is about.
Accessible: Data should be accessible. This involves storing data in a way that authorized people can access. It also means having clear instructions on accessing the data and ensuring it’s available through standard methods (like common file formats or web addresses).
Interoperable: Data should be compatible with other data and tools. This means using common formats and standards so different systems can work together. It ensures that data from different sources can be combined and compared.
Reusable: Data should be easy to use again for future research or applications. This involves providing detailed information about the data, such as how it was collected and processed, so that others can understand and use it correctly. It also means having licenses or terms that clearly state how the data can be used.
Findable
Findable is the “F” in F.A.I.R. data principles and refers to the ease with which information can be located within a system, database, or platform. Making data findable requires careful organization, indexing, tagging, and metadata so that machines or humans can locate the correct data swiftly. Improving findability helps organizations streamline workflows, reduce searching time, and increase overall productivity. Creating a user interface that enables readers to access findable data is essential when developing a digital catalogue raisonné. Filters, search functions, and clear links between pages let readers explore based on their personal interests and preferences.
Information decomposition
Information decomposition refers to the process of breaking down complex data into smaller components to analyze each piece individually. Looking at data in smaller categories or individual pieces can help researchers identify trends, anomalies, and relationships within large datasets, enabling a more granular understanding of information.
Interoperable
Interoperable is the “I” in F.A.I.R. data principles and refers to the ability of different systems, applications, or devices to work with the same data without compatibility issues. Standards and protocols that describe how data is organized are often established to achieve interoperability. Creating interoperable systems that can communicate effectively and smoothly exchange data reduces the need for manual intervention and minimizes errors. If a digital catalogue raisonné uses one computer program to manage internal data and a second program to publish that data online, it is crucial to choose interoperable systems and structure data to share seamlessly between them.
Linked data and linked open data (LOD)
Linked data refers to a method of structuring and connecting data on the web using standard web technologies like HTTP. The goal is to link different pieces of data together so that they can be easily discovered, navigated, and understood by machines. Linked data can be used in various scenarios, including within organizations for internal data management or between partner organizations where data-sharing agreements are in place. It is also common in applications where data privacy or intellectual property concerns require controlled access to the data.
Linked data and linked open data (LOD) are closely related concepts that originate from a goal to make data more interconnected and accessible. However, they differ primarily in terms of openness and accessibility. LOD is a subset of linked data that is made publicly available and accessible on the web under open licenses. Anyone can access, use, modify, and share this data without restrictions. A digital catalogue raisonné could link to open data provided by museums, galleries, or other catalogues raisonnés that have made their data open and available.
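At its simplest, linked data can be pictured as subject-predicate-object statements, called triples, where identifiers let one record point at another. The Python sketch below uses placeholder URIs and predicate names, not real identifiers or a real vocabulary.

```python
# Hypothetical triples: each statement links a subject to an object.
triples = [
    ("https://example.org/work/42", "createdBy", "https://example.org/person/monet"),
    ("https://example.org/person/monet", "name", "Claude Monet"),
]

def objects_of(subject, predicate, graph):
    """Follow a link: return all objects for a subject/predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]

# Hop from an artwork to its creator, then to the creator's name.
creator = objects_of("https://example.org/work/42", "createdBy", triples)[0]
name = objects_of(creator, "name", triples)[0]  # "Claude Monet"
```

Real linked data systems use RDF and shared vocabularies rather than ad hoc tuples, but the principle is the same: because records reference each other by identifier, machines can traverse from one dataset into another.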
Metadata
Metadata is information that describes and provides details about the piece of data to which it is attached. For example, metadata for a photo might include details like when and where it was taken, the camera settings, and the size of the file. By providing context and additional details, metadata helps computers and people to organize, find, and understand the primary data.
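A sketch of how metadata travels with the data it describes, using the photo example above; the field names and values are illustrative.

```python
# A record pairing the data itself with descriptive metadata.
photo = {
    "data": b"...binary image bytes...",
    "metadata": {
        "taken": "2023-05-01",
        "location": "Paris",
        "camera": "f/2.8, 1/200s",
    },
}

def find_by_location(records, place):
    """Search records by a metadata field rather than the data itself."""
    return [r for r in records if r["metadata"].get("location") == place]

matches = find_by_location([photo], "Paris")  # finds the photo
```

The search never inspects the image bytes at all; it is the metadata that makes the photo findable.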
Raw data
Raw data refers to unprocessed information collected directly from sources. This data has not been cleaned, organized, or analyzed. Raw data often contains errors, duplicates, or irrelevant details. Analysts must parse and refine it before extracting meaningful insights. This data can be voluminous and complex, requiring significant processing to be useful. Researchers working on a digital catalogue raisonné often deal with raw data in the form of scanned documents, images of archival materials without sufficient metadata, and biographical data imported from other institutions.
Reusable
Reusable is the “R” in F.A.I.R. data principles. It refers to the need to make data easily accessible for others to use, now and in the future. Making data reusable involves ensuring it is well-documented, standardized, and appropriately licensed. Reusable data must be described with rich metadata that clarifies its origin, structure, and limitations. This practice allows researchers to adapt and repurpose the data for new analyses or projects, ensuring that data remains valuable and applicable beyond its original purpose. For teams preparing a digital catalogue raisonné, it’s important to follow these guidelines so that future scholars can follow new lines of inquiry created by the publication.
Structured Query Language (SQL)
Structured Query Language, or SQL, is a programming language used to manage and query databases. SQL commands allow users to define data structures, insert records, and update existing information. Users can also employ it to filter and sort data, making it easier to analyze. SQL also facilitates complex operations, such as joining tables and aggregating data. The language adheres to standardized syntax, which ensures consistency across different database systems. Teams preparing a digital catalogue raisonné normally interact with a user interface rather than using SQL, but tasks including creating reports and other forms of data exports are often easier to complete using SQL directly with the database.
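As a sketch, Python’s built-in sqlite3 module can run SQL against a small in-memory database; the table and column names here are hypothetical.

```python
import sqlite3

# Create a throwaway in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artworks (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO artworks (title, year) VALUES (?, ?)",
    [("Haystacks", 1891), ("Water Lilies", 1906), ("Poplars", 1891)],
)

# Filter and sort with a single SQL query.
rows = conn.execute(
    "SELECT title FROM artworks WHERE year = ? ORDER BY title", (1891,)
).fetchall()
# rows == [("Haystacks",), ("Poplars",)]
conn.close()
```

The same SELECT statement would work, with minor dialect differences, against most relational databases, which is exactly the standardization the entry describes.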
Structured data
Structured data is information that is organized clearly and predictably. It provides the format and organization needed for effective storage and management in databases, while databases provide the tools and systems to handle, organize, and retrieve structured data efficiently. Both structured data and databases work together to maintain data consistency and integrity. Using structured data in a digital catalogue raisonné helps scholars and readers to easily input, find, and analyze valuable research, making it accessible for further scholarship.
Unstructured data
Unstructured data refers to data that does not have a predefined format or structure. If information is recorded in text documents, emails, social media posts, or multimedia files, it’s unstructured. Analyzing and managing unstructured data can be challenging because it does not fit neatly into traditional database structures. This type of data can contain valuable information but requires migration or sophisticated tools to process effectively. A repository of data stored without structure is often referred to as a data lake.
Categories for the Description of Works of Art (CDWA)
Categories for the Description of Works of Art, or CDWA, is a comprehensive framework developed by the Getty Research Institute to guide the documentation and cataloging of art and cultural objects. The CDWA includes detailed guidelines for describing various attributes of artworks, ensuring consistency and thoroughness in art documentation. CDWA outlines a hierarchical structure of categories and subcategories that cover all aspects of an artwork, including physical characteristics, creation, history, and contextual information. Examples include title, creator, materials, techniques, dimensions, and provenance. By providing standardized terminology and structure, CDWA ensures that artworks are documented consistently across different institutions, facilitating better communication, research, and data sharing within the art community.
CIDOC Conceptual reference model (CIDOC CRM)
The CIDOC Conceptual Reference Model (CIDOC CRM) is an international standard designed to facilitate the integration, mediation, and interchange of heterogeneous cultural heritage information. It provides a formal structure for describing concepts and relationships used in cultural heritage documentation, which helps represent complex historical, cultural, and contextual information about artworks and artifacts. The model emphasizes the importance of events and processes in cultural heritage, such as the creation, modification, and transfer of ownership of objects. This approach helps capture the dynamic history and context of cultural artifacts.
Because it covers a wide range of cultural heritage documentation, including museums, archives, libraries, and archaeological data, it enables diverse databases and information systems to share data seamlessly. It can also work with other metadata standards and ontologies. This includes linking with models like Dublin Core, enabling it to serve as a bridge between different information systems and enhancing the ability to perform cross-domain research.
CIDOC stands for the International Committee for Documentation, an International Council of Museums (ICOM) committee focused on documentation standards and practices in museum, archival, and library collections.
Dublin core
The Dublin Core is a standardized set of metadata elements used to describe a wide range of digital resources, facilitating their discovery, management, and interoperability across different systems. It consists of 15 core metadata elements — including title, creator, subject, and description — each with a standard name and definition. In addition to the 15 core elements, the Dublin Core can be extended with additional qualifiers that provide more specificity and context, such as specifying the format or type of a resource in more detail. Its simplicity and adaptability make it a popular choice for ensuring interoperability across metadata schemes and systems.
The Dublin Core is widely used in digital libraries, institutional repositories, and content management systems to provide consistent and standardized metadata for resource description, ensuring that digital resources are easily discoverable, accessible, and manageable.
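As a sketch, a few Dublin Core elements for a painting might be encoded in XML like this; the element values are illustrative, and the dc: prefix follows the standard namespace convention.

```xml
<!-- Illustrative Dublin Core record; values are placeholder examples. -->
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Haystacks</dc:title>
  <dc:creator>Claude Monet</dc:creator>
  <dc:date>1891</dc:date>
  <dc:format>oil on canvas</dc:format>
</metadata>
```

Because every system agrees on what dc:creator means, a record like this can move between repositories without losing its meaning.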
Encoded Archival Description (EAD)
Encoded Archival Description, or EAD, provides a uniform way to represent finding aids and publish them online in a structured way, which improves the accessibility and discoverability of archival resources. EAD uses XML (Extensible Markup Language), which allows for the encoding of hierarchical information in a machine-readable way. This structure reflects the complex organization of archival materials, including series, sub-series, and individual items.
The Society of American Archivists oversees the development and maintenance of EAD, ensuring that the standard evolves to meet the needs of the archival community. Many archival management systems and software tools support EAD, facilitating the creation, editing, and publication of EAD-encoded finding aids.
Exhibition Object Data Exchange Model (EODEM)
The Exhibition Object Data Exchange Model, or EODEM, is a standardized framework designed by ICOM to streamline the exchange of information about objects loaned for exhibitions between museums and cultural institutions. It uses an XML format to encode detailed data on the objects, including descriptions, dimensions, materials, condition, provenance, loan agreements covering terms, insurance, transport, and handling instructions. EODEM also includes information about the exhibition, such as its location, dates, and specific display or security requirements. This standardization ensures consistent, accurate, and efficient communication, reducing administrative burdens and errors, and facilitating seamless data sharing across different institutions. By providing a standard format, EODEM enhances collaboration and encourages more loans and joint exhibitions, ultimately improving the management and organization of exhibitions and the loan process.
Geonames
Geonames is a geographical database that provides a wide range of information about locations worldwide, including names, coordinates, and other relevant data. It integrates over 11 million geographical names into a unified database, offering details such as country, region, population, elevation, and time zones for each location. This data is accessible through a web-based interface and can be used for various applications, including mapping, geographic information systems (GIS), and location-based services. Geonames supports multilingual names and contains alternative names for places, which makes it a valuable resource for international applications. It also provides a free API, allowing developers to incorporate geographical data into their own software and services, enhancing functionalities like search, navigation, and location tracking. The collaborative nature of Geonames, which allows users to contribute and update information, ensures that the database remains comprehensive and up-to-date.
The Integrated Authority File (Gemeinsame Normdatei or GND)
The Integrated Authority File (Gemeinsame Normdatei or GND) is a comprehensive authority file used primarily in libraries, archives, and museums in Germany, Austria, and Switzerland for cataloging and indexing purposes. Managed by the German National Library (DNB) in collaboration with various library networks and other institutions, GND provides standardized information about entities such as people, organizations, works, and geographical locations. This system ensures consistency and interoperability in data management across different institutions by providing unique identifiers and structured metadata for each entity. GND enhances the accuracy and efficiency of cataloging, supports data linking and exchange, and improves search and retrieval of information in digital catalogs and databases. The collaborative nature of GND, with contributions from numerous institutions, ensures that the file is continuously updated and expanded, reflecting new and revised information. By facilitating precise and unified access to bibliographic and authority data, GND plays a crucial role in organizing and disseminating knowledge in the German-speaking world and beyond.
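The unique identifiers mentioned above are commonly published as URIs under d-nb.info. The sketch below, an illustrative assumption rather than an official utility, normalizes a bare GND ID or a full URI to that canonical form, which is useful when merging records that cite GND entities inconsistently.

```python
def gnd_uri(identifier):
    """Return the canonical GND URI for an identifier.

    Accepts either a bare GND ID (e.g. "118540238") or a full
    d-nb.info URI and normalizes it to the latter form. The ID
    here is an arbitrary example.
    """
    identifier = identifier.strip()
    prefix = "https://d-nb.info/gnd/"
    if identifier.startswith(prefix):
        return identifier
    return prefix + identifier

uri = gnd_uri("118540238")
```

Storing the full URI rather than the bare ID keeps records self-describing when data is exchanged between institutions.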
Iconclass
Iconclass is a comprehensive classification system used for cataloging and analyzing visual arts, particularly iconographic content, in artworks such as paintings, sculptures, and prints. Developed in the 1950s by Dutch art historian Henri van de Waal, Iconclass provides a structured framework for describing the subjects and themes depicted in artworks using a hierarchical code system. Each code corresponds to a specific iconographic concept, theme, or motif, ranging from religious and mythological subjects to everyday life scenes and abstract ideas. This system enables detailed and systematic documentation of visual content, facilitating research, comparison, and retrieval of artworks across collections and institutions. By standardizing the description of iconographic elements, Iconclass enhances the accessibility and interoperability of art historical data, supporting scholarly work, digital cataloging, and the creation of searchable databases for museum and library collections worldwide.
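The hierarchical codes described above can be unpacked mechanically: each leading substring of a code denotes a broader concept. The sketch below is deliberately simplified, assuming plain alphanumeric codes and ignoring the bracketed qualifiers and keys that full Iconclass notation can carry.

```python
def iconclass_ancestors(code):
    """Return an Iconclass code's broader concepts, most general first.

    Simplified sketch: treats every leading substring as an
    ancestor; real Iconclass notation can also include bracketed
    qualifiers and keys, which this ignores.
    """
    return [code[:i] for i in range(1, len(code) + 1)]

ancestors = iconclass_ancestors("25F23")
```

Expanding a code into its ancestors this way lets a database match a search for a broad theme against artworks catalogued under narrower codes.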
International Image Interoperability Framework (IIIF)
The International Image Interoperability Framework, or IIIF, is a set of open standards designed to enhance the accessibility, sharing, and presentation of digital images from cultural institutions like museums, libraries, and archives. IIIF allows high-resolution images and associated metadata to be delivered over the web in a consistent and interoperable manner, enabling institutions to present their digital collections with advanced features such as deep zoom, image manipulation, and annotation. It supports the seamless integration of images from different sources into a single viewing experience, making it easier for researchers, educators, and the public to access and compare artworks, manuscripts, and other visual resources. By promoting the use of common APIs, IIIF ensures that digital images are compatible across different platforms and applications, fostering collaboration and innovation in the digital humanities and cultural heritage sectors. This framework empowers institutions to provide rich, interactive experiences for their users while maintaining control over their digital assets.
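One of the common APIs mentioned above, the IIIF Image API, addresses image derivatives with a fixed URL pattern: {identifier}/{region}/{size}/{rotation}/{quality}.{format}. The sketch below composes such a URL; the base service address and image identifier are hypothetical examples.

```python
def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Compose an IIIF Image API (version 3.0) request URL.

    The path pattern {region}/{size}/{rotation}/{quality}.{format}
    comes from the Image API spec; `base` and `identifier` here
    are made-up examples.
    """
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

full = iiif_image_url("https://example.org/iiif", "ms-102_f1r")
# The same image as a 512-pixel-wide rendering:
thumb = iiif_image_url("https://example.org/iiif", "ms-102_f1r", size="512,")
```

Because every IIIF-compliant server answers the same pattern, a viewer can request a deep-zoom tile or a thumbnail from any institution's images without custom integration.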
General International Standard Archival Description (ISAD(G))
The General International Standard Archival Description, or ISAD(G), is a set of guidelines developed by the International Council on Archives (ICA) to standardize the creation of archival descriptions. ISAD(G) provides a comprehensive framework for describing archival materials, ensuring consistency and clarity across different institutions and collections. It outlines key elements necessary for effective documentation, such as the title, creator, extent, and content of the records, as well as administrative and custodial history. By standardizing these elements, ISAD(G) facilitates better management, retrieval, and sharing of archival information. It supports interoperability between different archival systems, enhancing the ability to search and access records globally. ISAD(G) also encourages a hierarchical approach to description, where larger bodies of records are broken down into smaller, more manageable units, reflecting the original order and context of the materials. This approach helps preserve the provenance and context of archival materials, which are crucial for historical research and interpretation.
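The hierarchical, multi-level approach described above (for example, fonds broken down into series, files, and items) can be sketched as nested records whose full reference codes are derived by walking the hierarchy. The unit codes and titles below are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    """One ISAD(G) unit of description (fonds, series, file, or item)."""
    code: str                 # this level's reference-code segment
    title: str
    children: list = field(default_factory=list)

def reference_codes(unit, prefix=""):
    """Yield (full reference code, title) pairs, joining levels with '/'."""
    full = f"{prefix}/{unit.code}" if prefix else unit.code
    yield full, unit.title
    for child in unit.children:
        yield from reference_codes(child, full)

# A hypothetical fonds containing one series with one file.
fonds = Unit("F1", "Artist's papers", [
    Unit("S2", "Correspondence", [
        Unit("D5", "Letters to dealers, 1921-1930"),
    ]),
])
codes = dict(reference_codes(fonds))
```

Deriving each code from its parents keeps the original order and context of the materials visible in every citation, which is the point of ISAD(G)'s hierarchical model.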
Lightweight Information Describing Objects (LIDO)
Lightweight Information Describing Objects (LIDO) is an easy-to-use metadata standard designed to help museums and cultural institutions share information about their artifacts and objects online. Developed to complement existing standards such as the Categories for the Description of Works of Art (CDWA) and Dublin Core, it provides a framework for describing various attributes of objects, such as their title, creator, date, medium, and provenance. Its lightweight nature makes it suitable for use in diverse contexts, including online collections and digital repositories, by enabling institutions to create and share rich, structured descriptions without the complexity of more detailed schemas. LIDO’s use of XML ensures that data can be easily integrated, searched, and retrieved across different systems, enhancing interoperability and improving access to cultural heritage information for researchers, educators, and the public.
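To make the XML encoding concrete, the sketch below parses a deliberately minimal LIDO-like fragment with Python's standard library. Real LIDO records carry many more wrapper elements and attributes than shown here; the namespace and the title path reflect the schema, but treat the fragment as a simplified illustration, not a valid record.

```python
import xml.etree.ElementTree as ET

# A minimal, simplified LIDO-style fragment (illustrative only).
record = """\
<lido:lido xmlns:lido="http://www.lido-schema.org">
  <lido:descriptiveMetadata>
    <lido:objectIdentificationWrap>
      <lido:titleWrap>
        <lido:titleSet>
          <lido:appellationValue>Still Life with Flowers</lido:appellationValue>
        </lido:titleSet>
      </lido:titleWrap>
    </lido:objectIdentificationWrap>
  </lido:descriptiveMetadata>
</lido:lido>
"""

NS = {"lido": "http://www.lido-schema.org"}
root = ET.fromstring(record)
title = root.find(".//lido:appellationValue", NS).text
```

Because every system writing LIDO uses the same element names and namespace, the same small parser can pull titles (or creators, dates, and so on) out of records from entirely different institutions.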