docs(db-extractors): audit & fix DB data-source connector docs against source code #970

Closed
Iamfle4ka wants to merge 1 commit into main from devin/1781787795-db-extractor-docs-audit

@Iamfle4ka

Collaborator

Jira issue(s): PROOF-XXX

Audited all 12 pages under src/content/docs/components/extractors/database/ against the authoritative connector source code:

  • Query-based connectors: keboola/db-extractor-{mysql,pgsql,oracle,mssql} (component IDs verified from each repo's deploy config).
  • Log-based CDC connectors: keboola/python-cdc-component (db_components/ex_mysql_cdc, ex_postgres_cdc).
  • NoSQL / other: keboola/ex-cosmosdb, keboola/ex-azure-storage-table, keboola/ex-mongodb, keboola/component-filemaker, keboola/ex-google-bigquery-v2.

Scope was accuracy fixes only — no restructuring or writing-style changes; frontmatter and screenshots preserved.

npm run build passes (254 pages, no errors). node scripts/audit-phase2.mjs reports no new findings for these pages.

Changes:

  • oracle/index.md — Wrong component URL: query-based connector linked to keboola.ex-db-mysql; corrected to keboola.ex-db-oracle (verified in keboola/db-extractor-oracle). Fixed broken anchor sqldb/#create-new-configuration → sqldb/#initial-setup.
  • postgresql/index.md — Wrong component URL keboola.ex-db-mysql → keboola.ex-db-pgsql. Connection Settings described "the MySQL server" (copy-paste from the MySQL CDC component) → "the PostgreSQL server". Data-type intro "MySQL datatypes" → "PostgreSQL datatypes". Replication Mode said the connector "reads the binlog" → "reads the transaction log (WAL)" (Postgres has no binlog). Fixed malformed SSH-tunnel link (missing closing )), AsciiDoc migration leftovers (link: prefixes, link:URL[text] form, {prodname} → Debezium, stray endif::community[]), broken anchors (#create-new-configuration → #initial-setup, #log-based-cdc → #postgresql-log-based-cdc), and a Replication Plugin Advanced Options screenshot pointing at img_4.png (heartbeat image) → img_2.png (alt text already said img_2.png).
  • mysql/index.md — Column-mask docs linked to the Debezium postgresql connector page; corrected to the mysql connector page (mysql.html#mysql-property-column-mask-*). Fixed broken anchors #create-new-configuration → #initial-setup and mysql#log-based-binlog-cdc → mysql/#mysql-log-based-cdc. Converted absolute help.keboola.com data-types link to internal relative link.
  • ms-sql/index.md — Fixed broken anchor #create-new-configuration → #initial-setup and a malformed inline-code link [`cdc_get_net_changes] (missing closing backtick).
  • index.md (overview) — "Azure Storage Table connector" linked to /cosmosdb/; corrected to /azure-storage-table/. Fixed broken anchor #create-new-configuration → #initial-setup.
  • azure-storage-table/index.md — Converted absolute help.keboola.com links to internal relative links.
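
Most of the fixes above are broken in-page anchors, which can be checked mechanically. The following is a minimal sketch, not the audit script used for this PR: the slug rule in `slugify` is a simplified assumption (the site's real slugger may differ), and the page contents in the usage example are made up for illustration.

```python
import re

# Markdown link with a fragment: [text](path#fragment)
LINK_RE = re.compile(r"\]\(([^)\s#]*)#([^)\s]+)\)")


def slugify(heading: str) -> str:
    """Assumed slug rule: lowercase, drop punctuation, spaces become hyphens."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"\s+", "-", text)


def heading_slugs(markdown: str) -> set:
    """Collect the slugs of all ATX headings (# to ######) in a page."""
    slugs = set()
    for line in markdown.splitlines():
        match = re.match(r"#{1,6}\s+(.+)", line)
        if match:
            slugs.add(slugify(match.group(1)))
    return slugs


def broken_anchors(source_md: str, pages: dict) -> list:
    """Return 'path#fragment' links whose fragment matches no heading in the target page.

    `pages` maps a link path (as written in the link) to that page's Markdown.
    Links to paths missing from `pages` are skipped rather than reported.
    """
    broken = []
    for path, fragment in LINK_RE.findall(source_md):
        target = pages.get(path)
        if target is not None and fragment not in heading_slugs(target):
            broken.append(f"{path}#{fragment}")
    return broken


# Illustrative data only; not the real page contents.
pages = {"sqldb/": "# SQL Databases\n\n## Initial Setup\n"}
source = (
    "See [old](sqldb/#create-new-configuration) "
    "and [new](sqldb/#initial-setup)."
)
print(broken_anchors(source, pages))
```

Skipping unknown paths keeps the check conservative: it only reports anchors it can positively show to be missing.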

Verified accurate (no changes needed): cosmosdb, azure-storage-table (params), filemaker, bigquery, mongodb, mongodb/mapping, sqldb config fields all match current config schemas / READMEs.


Flagged for a human (product decisions, not doc bugs)

  • MySQL CDC supported versions: mysql/index.md lists MySQL 5.7, 8.0.x, 8.2, but the component README (python-cdc-component/db_components/ex_mysql_cdc/README.md) lists only 8.0.x, 8.2 (driver 8.3.0). Confirm whether 5.7 is still supported before removing/keeping it.
  • MongoDB supported versions: mongodb/index.md says "from 4.4 to the latest (6.0)", but the component now bundles mongodb-database-tools 100.15.0 (ex-mongodb/Dockerfile), which supports newer MongoDB (7.0/8.0). The "(6.0)" parenthetical is likely stale; please confirm the intended supported range.
  • Upstream component READMEs share the copy-paste bugs: ex_postgres_cdc/README.md itself contains the same "MySQL server" connection-settings text and "binlog" wording (lines ~409, 451–452). Recommend fixing upstream so the docs and component stay in sync.
  • Bootstrap HTML tables (migration leftovers): mongodb/index.md and mongodb/mapping.md still use raw <table class="table table-bordered"> markup. Left as-is (out of accuracy scope / would be a restructure); flagging for a future cleanup pass.
  • Pre-2022 screenshots — Steps/labels/field names on the audited pages still match the current config schemas, so screenshots were left in place. The BigQuery page's Google Cloud Console screenshots reflect Google's UI (not Keboola code) and may drift independently.
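
The copy-paste class of bug flagged above (MySQL wording in the PostgreSQL README) could be surfaced with a simple term scan. This is a rough sketch under stated assumptions: the `ENGINE_TERMS` table is a hypothetical starting list, and a hit is only a candidate for human review, since a page may legitimately mention another engine.

```python
import re

# Hypothetical mapping: terms that look suspicious on a given engine's page.
ENGINE_TERMS = {
    "postgresql": ["MySQL", "binlog"],
    "mysql": ["PostgreSQL", "WAL"],
}


def foreign_terms(page_engine: str, text: str) -> list:
    """Return (line_number, term) pairs for other-engine terms found in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for term in ENGINE_TERMS.get(page_engine, []):
            # Whole-word, case-sensitive match to limit false positives.
            if re.search(rf"\b{re.escape(term)}\b", line):
                hits.append((lineno, term))
    return hits


# Illustrative text modeled on the wording quoted in this PR.
sample = (
    "Hostname of the MySQL server.\n"
    "The connector reads the binlog.\n"
    "The connector reads the transaction log (WAL)."
)
print(foreign_terms("postgresql", sample))
```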

Note on the pre-audit hints: the "5 pages with .html-style internal links" did not materialize — all .html links in these pages are legitimate external URLs (debezium.io, microsoft.com, postgresql.org, etc.). The actual migration leftovers found were AsciiDoc artifacts in postgresql/index.md (fixed above).
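
The two checks described in this note can be sketched as follows. This is an illustration, not the project's audit script: the AsciiDoc patterns cover only the leftovers named in this PR (`link:` macros, `{prodname}`, conditional directives such as `endif::community[]`), and the URLs in the usage example are placeholders.

```python
import re
from urllib.parse import urlparse

LINK_TARGET_RE = re.compile(r"\]\(([^)\s]+)\)")

# Patterns for the AsciiDoc leftovers mentioned in this PR (assumed, not exhaustive).
ASCIIDOC_PATTERNS = {
    "link-macro": re.compile(r"\blink:\S+"),
    "attribute": re.compile(r"\{prodname\}"),
    "conditional": re.compile(r"\b(?:ifdef|ifndef|endif)::[\w,+-]*\[\]"),
}


def html_links(markdown: str) -> tuple:
    """Split link targets whose path ends in .html into (external, internal)."""
    external, internal = [], []
    for target in LINK_TARGET_RE.findall(markdown):
        parsed = urlparse(target)
        if not parsed.path.endswith(".html"):
            continue
        # A network location (host) means the link leaves the site.
        (external if parsed.netloc else internal).append(target)
    return external, internal


def asciidoc_leftovers(markdown: str) -> list:
    """Return (line_number, kind) for each AsciiDoc artifact pattern found."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for kind, pattern in ASCIIDOC_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, kind))
    return hits


# Placeholder content for illustration.
page = (
    "See [docs](https://example.com/connectors/mysql.html#masking) "
    "and [old](/components/mysql.html).\n"
    "See link:https://example.com[docs].\n"
    "{prodname} connector\n"
    "endif::community[]\n"
)
print(html_links(page))
print(asciidoc_leftovers(page))
```

Only the internal `.html` list would indicate a migration leftover; external `.html` URLs are expected.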

Link to Devin session: https://app.devin.ai/sessions/fe9e6287d36141e6986c15a5c67f53d2
Requested by: @Iamfle4ka

…rors

Audited the 12 database data-source connector pages against the connector
source code. Fixes:
- correct query-based component IDs for Oracle (keboola.ex-db-oracle) and
  PostgreSQL (keboola.ex-db-pgsql)
- fix broken in-page anchors (#initial-setup, #mysql-log-based-cdc,
  #postgresql-log-based-cdc)
- correct PostgreSQL page copy-paste errors (MySQL->PostgreSQL connection
  settings, datatype mapping intro, binlog->WAL)
- point MySQL column-mask links to Debezium mysql connector docs
- repair malformed/AsciiDoc-leftover links and {prodname} placeholders
- fix wrong Azure Storage Table link on overview page
- convert absolute help.keboola.com links to internal relative links

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@devin-ai-integration

Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

@vercel

vercel Bot commented Jun 18, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| connection-docs | Ready | Preview, Comment | Jun 18, 2026 1:04pm |


@devin-ai-integration

Contributor

Closing per change of plan: this is an audit-only task. No documentation edits to be merged. Findings will be posted to the Linear issue/document. Branch kept for reference.
