Random File Corruption with MMDComposer5 using Markdown from ChatGPT Website
Completed
This may or may not be related to ChatGPT markdown output, but that's where some of my drafts have been originating.
When the Markdown is not well formed (copied from the ChatGPT.com website with gpt-5), the formatting can get corrupted when I start editing the file.
I'll notice that the file's Markdown highlighting gets wonky and isn't correct. Then deleting or editing portions of the file starts deleting the wrong sections. The file format gets corrupted and I lose information.
I may need to switch back to MM4 or we need to figure out a fix for this file corruption.
I don't know if I need a “cleanup” stage from ChatGPT so that my working draft will not cause this issue, or if there's another workaround. I don't have a screenshot, since it was on my work machine, but if it happens on my personal machine, I'll try to document it.
I have been leery of editing critical files because I have lost information 3x now.
Steps
- Chat with ChatGPT in Safari on Mac (Sequoia)
- Use the “copy” button to get “markdown” output from the website (not macOS ChatGPT)
- Start editing, or reformatting the document (removing bold, fixing bullets, editing indentation)
- Start writing new content
Expected
- Changes at one location should not alter or delete text at another location in the file.
- Syntax highlighting for Markdown should look normal.
Actual
- Changing text (deleting, copy/paste, writing new content) causes other content in the document to change.
- Syntax highlighting does not highlight the first 3-5 characters in a list.
Reproducible: 3/10 attempts recently
Version: MMDComposer5 - Version 5.0.0 (2025.09.21.14.53)
-
Usually I'm trying to edit the bullet points and remove the copious amounts of bold text from ChatGPT. That starts messing up my bullets, and sometimes bullets get corrupted with `***`.
Sample markdown (Unsure if this will corrupt)
Let’s break it down in plain terms — this is exactly how Carbon Copy Cloner (CCC) and your backup sizes, timing, and encryption work.

---

## 🧮 1. Drive Size Rule

Your backup drive must be **at least as large as the used space** on your Mac — ideally **equal or larger than your internal drive**.

| Internal SSD | Recommended Clone Drive |
| ----------------- | --------------------------------------------- |
| 1TB | 1TB (works fine) or 2TB (more breathing room) |
| 2TB | 2TB or 4TB (future-proof) |
| 8TB (future plan) | 8TB backup (when you upgrade) |

If you get a **4TB external** now for a **2TB internal**, that’s perfect. It’ll handle growth and lets you keep multiple snapshots if desired.

You don’t need two drives right away — you can start with one and later rotate a second for off-site safety.

---

## ⚡️ 2. Clone Speed and Interruption Behavior

### First clone:

* **Duration:** 1–3 hours for 2TB (depends on drive speed — SSD over Thunderbolt is fastest).
* **Use impact:** You *can* keep working. CCC runs in the background.
* **If you unplug or shut down:** It just pauses or fails gracefully. You can resume later.

### After first clone:

* Incremental updates are **very fast** — often just a few minutes, because only changed files are copied.

You can set it to run **at night** or on **connect**, e.g.:

> “When this drive is connected, update the clone automatically.”

---

## 🔒 3. Encryption Behavior

Your **internal drive is already encrypted** if FileVault is turned on (System Settings → Privacy & Security → FileVault). CCC doesn’t automatically carry over that encryption — **you decide when formatting the destination drive**:

### Two choices:

1. **APFS (Encrypted)** → Secure, requires a password to unlock.
   * You set this once in Disk Utility.
   * CCC just clones into it.
   * You must unlock it before use (can auto-unlock if you store the password in your keychain).
   * Strongest privacy — if stolen, it’s unreadable.
2. **APFS (not encrypted)** → Simpler, faster, less secure.
   * Anyone with the drive can read data.
   * Fine for a first simple setup.

✅ **Start simple:** format as plain APFS first → confirm backups work → later, reformat as APFS (Encrypted) once you’re comfortable.

If encrypted, the password is just for that drive — it doesn’t reuse your Mac login password automatically.

---

## 🕹 4. Time Machine in 2025

Yes, **Time Machine is still supported** and stable on macOS Sonoma / Sequoia:

* Works best with **APFS** drives now (not HFS+).
* Performs hourly incremental backups.
* You can exclude large media folders.
* Pairs perfectly with a NAS (Synology is great for this).

However:

* It can occasionally hang or fail if your NAS goes offline or permissions change.
* It’s slower for full restores, but great for recovering single files.

So CCC = “fast full restore.” Time Machine = “everyday version control.”

---

## 🧱 5. Best Starting Plan for You

### Step 1 — Prep drive

* Plug in a 4TB SSD.
* Open **Disk Utility** → Erase → Format: `APFS` (or `APFS (Encrypted)` if ready).

### Step 2 — Create CCC task

* Choose your internal drive as **Source**.
* Choose your SSD as **Destination**.
* Check **SafetyNet off** (or on if you want versioning).
* Schedule: “Run when drive is connected.”
* Done. CCC will keep it synced automatically.

### Step 3 — (Optional later)

* Add a **second drive** → store in your firebox, rotate monthly.

---

## 🧩 6. Future Workflow

You’ll eventually have this setup:

| Type | Tool | Drive | Frequency | Notes |
| ------------------------ | ------------------ | --------------- | -------------- | ------------------------------------ |
| System clone | Carbon Copy Cloner | External SSD #1 | Weekly auto | Bootable, encrypted later |
| Offline safety | Carbon Copy Cloner | External SSD #2 | Monthly manual | Stored in firebox |
| Incremental file history | Time Machine | NAS | Continuous | File-level restore |
| Raw video archive | External HDDs | Labeled by year | Manual | Quiet storage, disconnected normally |

---

If you want, I can give you the **exact Disk Utility + CCC setup steps** (what buttons to press, settings to toggle, and what to expect during the first run). Would you like that as your next step?
0 -
Paul,
I don't see anything immediately wrong in the syntax highlighting from that sample. And other than the (IMO) odd use of emojis in the headers, I don't see anything that jumps out as “wrong” in the syntax. But maybe I just skimmed over it??
If it happens again, could you try a screen recording and using “Undo” to try and backtrack? That might help give me a better idea as to what is happening.
I haven't seen anything like what you're describing, so I don't have any immediate ideas as to what is happening unfortunately…
Thank you!!
0 -
OK, I can easily reproduce this. It's related to the indentation of the file. When I try to fix the indentation (I might be inserting tabs, not spaces; not sure), it all goes haywire. Undo is unable to fix the problems. The file stays corrupted after the incorrect contents are deleted or modified.
Video:
https://youtu.be/Wk7yrA_6TpQ
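One way to settle the tabs-versus-spaces question is to inspect the leading whitespace of the draft before editing it. The following is only a rough sketch: `report_indentation` is a hypothetical helper name, not something MMDComposer or ChatGPT provides, and it assumes the draft is available as a plain string.

```python
def report_indentation(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, kind) for every indented, non-blank line.

    kind is 'tabs', 'spaces', or 'mixed', describing the leading whitespace.
    """
    findings = []
    for number, line in enumerate(markdown.split("\n"), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent or not line.strip():
            continue  # skip unindented lines and whitespace-only lines
        has_tab = "\t" in indent
        has_space = " " in indent
        if has_tab and has_space:
            kind = "mixed"
        elif has_tab:
            kind = "tabs"
        else:
            kind = "spaces"
        findings.append((number, kind))
    return findings


if __name__ == "__main__":
    draft = "* parent\n  * two spaces\n\t* one tab\n \t* space then tab"
    for number, kind in report_indentation(draft):
        print(f"line {number}: indented with {kind}")
```

Running this on a copy of the draft before and after an edit would show whether the editor is introducing tabs into a list that was originally indented with spaces.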
Sample Text:
1. Edit this to remove top/bottom gibberish.
2. Try to fix the indentation of the bullets.
Actual
It starts inserting ** in the list (with tabs?) Then formatting gets wonky, then editing starts editing in the wrong locations.

Perfect — you’ve cleared the space and stabilized the commit workflow, which unlocks your core engineering loop. Here’s your **updated, trimmed-down task list** — focused only on what remains relevant now.

---

## 🧭 Current Focus (as of now)

### 🔥 1. Unified Detector Harness

* Finalize the pipeline that runs on:
  * **Single photo**
  * **Photo set**
  * **Video frames**
* Confirm frame-by-frame inference consistency (no async drift).
* Integrate directly with **Arachna Detector Mac app** (you already have the overlay visualization).
* Package as an **SPM module** for reuse in:
  * SpiderApp (TestFlight)
  * CLI utilities
* Ensure commit workflow stays enforced (only `git status`, `git diff`, and `commit.sh` allowed).

---

### ⚙️ 2. Square Crop Logic (In Progress)

* Implement consistent square cropping for training + inference.
* **Logic:**
  * Center crop using **long edge**; if it exceeds bounds, use **short edge** fallback.
  * Clamp coordinates to image bounds.
  * Apply uniformly across training export, inference, and preview.
* Propagate to:
  * Training crop exporter
  * Classifier inference path
  * Detector debugging pipeline
* Test on edge cases (extreme aspect ratios, partial spiders near image borders).

---

### 🧩 3. Classifier Retraining (Post-Harness)

* Use current best quality-gated crops.
* Train **v18 classifier** with new square crops.
* Validate reduction in false positives (esp. blurred recluse / false widow).
* Integrate into detector harness → test live performance.

---

### 🗺️ 4. Geo-Aware Dataset Filtering

* Add geolocation to Postgres (PostGIS or lat/long columns).
* Prune non-US species to reduce model confusion.
* Update export commands to filter by region before training.

---

### 🧠 5. Training Workflow Documentation

* Continue building the **`training/` Git folder**:
  * `TRAINING_NOTES.md` — versioned command recipes.
  * `scripts/` — reproducible CLI commands (dataset exports, crops, training).
  * `.gitignore` — exclude all media; commit only scripts + notes.
* Optional: store square-crop parameters and seed values with each model version.
* Goal: every model (e.g., `classifier_v18`, `detector_v3`) has a matching **command lineage** recorded in Git.

---

### 🧘 6. Quality Work (Background)

* Tag more low-quality photos when you have downtime.
* Incrementally improve Quality v4 → v5 later.

---

### ✅ Active Priorities Order

| Priority | Focus |
| -------- | ------------------------------------------- |
| 1️⃣ | Unified Detector Harness (core pipeline) |
| 2️⃣ | Square Crop Logic (integrated into harness) |
| 3️⃣ | Classifier Retraining w/ Square Crops |
| 4️⃣ | Geo-Aware Dataset Filtering |
| 5️⃣ | Training Notes & Scripts in Git |
| 6️⃣ | Quality Tagging (low-energy filler work) |

---

Would you like me to generate a compact Markdown checklist version (`SpiderApp_Tasks.md`) you can drop into your repo for tracking this sprint?
0 -
Paul,
Thanks for that example. With that, I was able to reproduce the issue. The problem is the improperly indented list. For example, the two leading spaces in front of `* **Single photo**`. The improper list is messing up my list cleanup algorithm, so I need to figure out how to make that more resilient.
In the meantime, highlight the entire list (including through `Ensure commit workflow…`) and press Cmd-[ to shift the lines to the left, thereby removing the leading whitespace. Then you can indent individual lines as desired.
You would have to do this for each list. Or you could select the entire document for this in a single step, if you do not have any leading whitespace that is not part of a list (e.g. indented code blocks).
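For anyone who wants a “cleanup” pass before pasting a draft into the editor, the same idea can be sketched as a small script. This is an illustration only, not the app's list cleanup algorithm: `flatten_list_indentation` is a hypothetical name, the regular expressions are a simplified guess at what counts as a list item, and the function removes all leading whitespace from list lines, so nesting has to be re-applied by hand afterwards, just as with the Cmd-[ workaround.

```python
import re

# An indented list item: leading spaces/tabs, then "*", "-", "+", "1." or "1)",
# then at least one space or tab before the item text.
LIST_ITEM = re.compile(r"^[ \t]+((?:[*+-]|\d+[.)])[ \t]+.*)$")
# Opening or closing line of a fenced code block.
FENCE = re.compile(r"^[ \t]*(```|~~~)")


def flatten_list_indentation(markdown: str) -> str:
    """Strip leading whitespace from list-item lines; leave every other line alone."""
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        if FENCE.match(line):
            in_fence = not in_fence
            out.append(line)
            continue
        # Lines inside fenced code are never touched.
        match = None if in_fence else LIST_ITEM.match(line)
        out.append(match.group(1) if match else line)
    return "\n".join(out)


if __name__ == "__main__":
    draft = "* Finalize the pipeline that runs on:\n  * **Single photo**\n  * **Photo set**"
    print(flatten_list_indentation(draft))
```

Only lines that look like list items are changed, so indented code blocks and indented bold text such as `  **note**` keep their whitespace. That sidesteps the caveat above about selecting the whole document when some leading whitespace is not part of a list.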
Thanks again!!
0 -
A new build has been pushed that should fix this.
Thanks again!
0
Comments
5 comments