...
| Time | Item | Lead | Notes |
|------|------|------|-------|
| 1 min | Welcome & Antitrust Policy Notice | Chairs | |
| 2 min | Introduction of new members | Chairs | |
| 2 min | Agenda review & open Action Items | Chairs | |
| 5 min | Co-Chair volunteers | Chairs | |
| 35 min | Presentation and discussion on tooling and workflow | | |
| 12 min | Integration with Operations Team | | |
| 2 min | Review of Decisions and new Action Items | Chairs | |
| 1 min | Next meeting | Chairs | |
Recording
- Link to the file
Presentation(s)
...
- Welcome and Linux Foundation antitrust policy
- Introduction of new members
- Agenda review & open Action Items
- Item 1
- Daniel Hardman presented his slides and recommendations about terminology tooling
- His evaluation of existing tooling (open source and commercial) is that nothing still maintained fits our needs well
- We need "just enough tooling"
- Daniel proposes the following data pipeline
- Capture—receive raw data from a one-off ticket, or batch submission via PR or script
- Scan—human sanity check to triage, catch basic issues
- Merge—commit to repo, convert to internal data model, assign permalinks, becomes publishable
- Mature—Run (semi-)automated QA. Generate tickets. Propose "Accepted" status for community = WG. Assign tickets for curators of other communities.
- Accept—Review and adjust ticket statuses in WG meeting.
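The five pipeline stages above could be sketched as a simple state machine; this is an illustrative sketch only, not an implementation the WG has agreed on (the `Stage` and `NEXT` names are assumptions):

```python
from enum import Enum

class Stage(Enum):
    CAPTURE = "capture"  # receive raw data (ticket, PR, or script)
    SCAN = "scan"        # human sanity check to triage basic issues
    MERGE = "merge"      # commit to repo, assign permalinks
    MATURE = "mature"    # run (semi-)automated QA, generate tickets
    ACCEPT = "accept"    # WG reviews and adjusts ticket statuses

# Allowed forward transitions in the proposed pipeline
NEXT = {
    Stage.CAPTURE: Stage.SCAN,
    Stage.SCAN: Stage.MERGE,
    Stage.MERGE: Stage.MATURE,
    Stage.MATURE: Stage.ACCEPT,
}

def advance(stage: Stage) -> Stage:
    """Move a submission one step forward through the pipeline."""
    if stage not in NEXT:
        raise ValueError(f"{stage.value} is the final stage")
    return NEXT[stage]
```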
- Daniel proposed a basic data model (see the slides)
- Rieks noted: "I'm concerned about the relation between concept and term having 1 - n multiplicities rather than n - m multiplicities. To be discussed."
- Daniel proposed a process by which every stakeholder community can review and decide on the status of a term without having to necessarily agree with other communities
- Daniel proposed two major requirements for our tooling
- Major feature #1: Manage Curation
- Anybody can propose content
- Tickets are the way to change content status
- Anybody can raise a ticket
- Review tickets are tied to a community (scope)
- Each community has its own status
- Each community has its own review process and appoints one or more "curators" <== term proposed by Daniel
- Curators directly update status for their community (or admins update per instructions from a curator)
- Enforce some data integrity rules and workflows
- Track contributors, history
- Stats
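The curation rules above (curators update status for their own community only, with contributor history tracked) might look roughly like this; the dict shapes and function name are hypothetical, not the WG's actual data model:

```python
def update_status(entry: dict, community: str, new_status: str,
                  actor: str, curators: dict) -> None:
    """Record a per-community status change, enforcing that only that
    community's curators may make it.

    `curators` maps community name -> set of authorized handles
    (an assumed shape, for illustration only).
    """
    if actor not in curators.get(community, set()):
        raise PermissionError(f"{actor} is not a curator for {community}")
    # each community has its own status
    entry.setdefault("status", {})[community] = new_status
    # track contributors and history, per the feature list
    entry.setdefault("history", []).append((community, new_status, actor))
```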
- Major Feature #2: Publish
- Emit content per community
- Timely updates (realtime desirable)
- Artifacts can be styled/customized per community
- Static, searchable, indexable HTML
- One doc, or one doc per term / concept
- Stable relative links
- Programmable data (CSV or JSON) and/or API
- An example is writing a script to analyze a glossary or a group of terms
- Full metadata available
- Contribution history
- Status change history
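As an example of the "programmable data" requirement, a script analyzing a glossary export could be as small as this; the CSV column name `status` is an assumption, since no export schema has been defined yet:

```python
import csv
import io

def count_by_status(csv_text: str) -> dict:
    """Tally glossary entries by status, given a CSV export that has a
    'status' column (column name assumed for illustration)."""
    counts: dict = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["status"]] = counts.get(row["status"], 0) + 1
    return counts
```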
- Data and permalinks
- Live data should be in the internal data model
- Browsable in internal data model
- https://github.com/<repo>/terms/agent-119 (term named by first EN label + concept num)
- https://github.com/<repo>/concepts/119-agent (concept named by num + first EN label)
- Hyperlinked to issues
- Links are stable across changes in terms, definition text <== permalinks are in place, so terms can be deprecated and still resolve
- Published data
- Browsable in glossary data model format
- Published by communities on sites under their control (they put static HTML where they want)
- <glossary website>/agent.html (no concept links)
- Links are versioned (not guaranteed stable across releases) <== not permalinks
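The permalink patterns above (e.g. `terms/agent-119` and `concepts/119-agent`) could be generated like this; the slug normalization (lowercase, spaces to hyphens) is an assumption beyond what the slides specify:

```python
def term_path(concept_num: int, first_en_label: str) -> str:
    """Term permalink slug: first EN label + concept number."""
    slug = first_en_label.strip().lower().replace(" ", "-")
    return f"terms/{slug}-{concept_num}"

def concept_path(concept_num: int, first_en_label: str) -> str:
    """Concept permalink slug: concept number + first EN label."""
    slug = first_en_label.strip().lower().replace(" ", "-")
    return f"concepts/{concept_num}-{slug}"
```

Because the slug embeds the concept number, the link stays resolvable even if the term is later deprecated or its definition text changes.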
- Dan asked if we could use GitHub Actions to publish "live data"
- Daniel said yes, that would result in the published data reflecting the live data
- Drummond asked about how a version of a glossary can be "frozen" for a specific community, i.e., a spec
- Daniel said that the community could fork off a version of their glossary
- You can also point off to a specific version of the data at any point in time.
- Specific tool proposals
- terminology database = github repo
- Ingest new data as Markdown documents in GitHub
- Then after processing into internal data model, still keep each "table" in as a Markdown document in GitHub
- to do QA for WG review of submitted data: new python script (Daniel volunteering but inviting others)
- to manage internal data model
- to convert from submit format to internal data model: new python script (Daniel volunteering but inviting others)
- to edit and browse internal data model: modified ESSIF / GRNet tool (the one Rieks has developed)
- to update status, add hyperlinks, propagate tags: new python script(s)
- to emit static HTML: github action hooked up to #2 in preceding bullet
- to emit programmable data (CSV, ...)—TBD
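The proposed submit-format-to-internal-model conversion script might start from something like the sketch below; the Markdown heading conventions (`# Term:` and `## Definition`) are hypothetical, since the actual submission format is still TBD:

```python
import re

def parse_submission(md_text: str) -> dict:
    """Parse a hypothetical Markdown term submission into internal-model
    fields. The heading conventions here are assumptions for illustration."""
    term = re.search(r"^# Term:\s*(.+)$", md_text, re.MULTILINE)
    definition = re.search(
        r"^## Definition\s*\n(.+?)(?=^#|\Z)",
        md_text, re.MULTILINE | re.DOTALL,
    )
    if not term:
        raise ValueError("submission is missing a '# Term:' heading")
    return {
        "label": term.group(1).strip(),
        "definition": definition.group(1).strip() if definition else "",
        "status": "proposed",  # new entries start unaccepted
    }
```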
- Configuring a community
- Provide official name and #tag
- Identify and train curators (github handles, contact info)
- Configure artifacts
- Configure data import
- Train community on curation and publication processes
- Configuring an artifact
- Choose publication mechanism (output template, scripts, targets, collateral)
- Setup schedule or triggers for publication
- Provide selection criteria (tags, statuses)
- Test run
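The "selection criteria (tags, statuses)" step above amounts to a filter over glossary entries; a minimal sketch, assuming a per-entry `tags` list and `status` field (both assumed shapes):

```python
def select_entries(entries: list, tags=None, statuses=None) -> list:
    """Apply an artifact's selection criteria to glossary entries.
    Empty criteria match everything; entry shape is assumed."""
    want_tags = set(tags or [])
    want_statuses = set(statuses or [])
    return [
        e for e in entries
        if (not want_tags or want_tags & set(e.get("tags", ())))
        and (not want_statuses or e.get("status") in want_statuses)
    ]
```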
- Configuring data import
- One-time, ad-hoc, ongoing?
- Write and/or tune script(s)
- Dry runs with cleanup
- First import
- Trigger for deltas
- Working Group Duties
- Triage tickets
- Train communities
- Setup communities and artifacts
- Liaise with communities
- Approve "Accepted" status requests
- Propose new data sources
- Configure and maintain tool integrations
- Develop output templates
- Run tools for ingestion
- Review data quality
- Other proposals—for our actions
- Publish draft glossaries from our 3 datasets
- Data sanity check
- Convert to internal data model
- Configure artifacts and export
- Assign community curators to approve
- Forcing function for tools: first cut by mid Dec?
- Designate 1 or more chairs for WG
- I will volunteer to be one
- Divvy up WG work for #1 (tickets)
- Figure out collaboration model outside WG meetings
- Modify agenda so we spend a chunk of our time working tickets
- Item 2
- Item 3
- Review of Decisions and Action Items
- Next meeting
...