truevektor/README.md

Michał Trojaczek

Criminal Procedure • Audio Forensics • Digital Evidence • Data Analysis
I use GitHub as a learning laboratory and a platform for transparent, reproducible technical-legal research.


👋 About Me

I work at the intersection of:

  • criminal procedure and evidence law,
  • audio forensics (diarization, transcription, signal analysis),
  • digital evidence engineering and chain-of-custody reconstruction,
  • technical building assessments (heating systems, thermal audits, documentation).

My repositories document how law, technology, measurement, and methodical reasoning can be integrated into reproducible workflows.

GitHub allows me to:

  • experiment with tools such as Python, Whisper, pyannote, ffmpeg,
  • build full evidence repositories for real cases (e.g., II K 70/24),
  • teach others how technical and legal inquiry complement one another.

🎓 GitHub Education – How I Use It

Within GitHub Education and the Student Developer Pack, I focus on:

  • building open case studies for students of law, forensics, and data science,
  • creating step-by-step pipelines for evidence analysis:
    • audio preprocessing,
    • diarization and transcription,
    • timing consistency checks,
    • comparison between courtroom recordings and official transcripts,
  • documenting research-ready, reproducible workflows.

My repositories are designed so that anyone can:

  1. clone the project,
  2. follow the README instructions,
  3. reproduce the analysis with full transparency.

🔍 Current Learning & Research Areas

  • advanced diarization and speaker-tracking (pyannote),
  • forensic-grade signal analysis (formants, silence segmentation, artifact detection),
  • automated procedural reporting (Python → DOCX/PDF pipelines),
  • GitHub Actions for:
    • auto-updating statistics,
    • generating forensic summaries,
    • validating data integrity in multi-file repositories.
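As an illustration of the data-integrity item above, here is a minimal sketch of a manifest check that a CI job could run. It assumes a manifest mapping relative file paths to expected SHA-256 digests; the function names and the manifest shape are hypothetical and are not taken from any of the repositories described here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Check files under `root` against expected digests.

    `manifest` maps a relative path to its expected SHA-256 hex digest
    (an assumed format). Returns relative path -> "ok", "missing" or "modified".
    """
    results = {}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif sha256_of(target) != expected.lower():
            results[rel_path] = "modified"
        else:
            results[rel_path] = "ok"
    return results
```

A workflow step could fail the build whenever any value in the returned mapping differs from "ok".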

🧪 Featured Educational Projects

IIK_70_24 – Digital Evidence Repository

A structured, multi-layered reproduction of a real criminal case, including:

  • evidence indexing (“digital case file” format),
  • audio recordings (M4A/OGG/WAV) + diarization,
  • transcript comparison tools,
  • chain-of-custody reconstruction,
  • educational notebooks explaining the methodology.

Intended use: legal tech courses, forensic audio workshops, digital evidence methodology training.


fonoskopia-tools

A modular pipeline for:

  • audio conversion and resampling (ffmpeg),
  • diarization (pyannote),
  • Whisper-based transcription,
  • formatting transcripts for legal review.

Includes well-commented workflows for both beginners and advanced students.
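The conversion and resampling step might look roughly like the sketch below. It assumes a 16 kHz mono 16-bit PCM WAV target, a common input format for speech models; the actual settings used in fonoskopia-tools are not stated here, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: Path, target: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that converts `source` to mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the target if it already exists
        "-i", str(source),        # input file (M4A, OGG, WAV, ...)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        str(target),
    ]


def convert(source: Path, target: Path, sample_rate: int = 16000) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(source, target, sample_rate), check=True)
```

Keeping command construction separate from execution makes it easy to log the exact command alongside the output file, which helps when the conversion itself has to be documented.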


joliot-curie-19a

A technical repository documenting:

  • heating & hot-water system diagnostics,
  • 3D measurement workflows (Leica DISTO X6),
  • thermal imaging interpretation (FLIR C5),
  • building audit methodology.

Intended use: building-science labs, civil engineering students, interdisciplinary research combining law & engineering.


🛠️ Tech Stack

Languages & Tools

  • Python (analysis, automation, reporting)
  • ffmpeg, sox (audio processing)
  • Whisper, pyannote-audio (speech & diarization)
  • Git, GitHub (Actions, LFS, structured evidence repositories)
  • Audacity / Adobe Audition (signal inspection)

Hardware

  • Dell Precision 7550 (CUDA)
  • Samsung Galaxy S24 Ultra (raw recordings, photogrammetry)
  • Leica DISTO X6, FLIR C5, Bosch GLL 3-80 C
  • HP Color LaserJet Pro M252dw (documentation output)

📈 Automated Stats

This profile uses GitHub Actions to automatically generate:

  • commit-activity graphs,
  • language-usage statistics,
  • repository analytics.

Example outputs:

[Image: Commit activity]
[Image: Languages]


🧭 Assignments / Labs (GitHub Education Compatible)

The following exercises are designed for students, educators, and researchers who want to explore digital forensics, legal-tech analysis, or data-driven methodology.

Lab 1 — Audio Evidence Integrity Check

Objective: Reproduce a basic forensic workflow.
Tasks:

  1. Clone fonoskopia-tools.
  2. Convert the sample audio files using ffmpeg.
  3. Run diarization and generate a segment map.
  4. Compare diarization timestamps with the transcript.
  5. Submit a short report explaining discrepancies.
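Task 4 can be approached as an interval-overlap check. The sketch below assumes that both the transcript and the diarization output have already been reduced to (start, end, speaker) segments in seconds and that diarization labels have been mapped to the speaker names used in the transcript. The `Segment` type, the function names, and the 50% coverage threshold are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float   # seconds from the beginning of the recording
    end: float     # seconds from the beginning of the recording
    speaker: str


def overlap(a: Segment, b: Segment) -> float:
    """Length in seconds of the time range shared by two segments."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def find_discrepancies(transcript, diarization, min_coverage=0.5):
    """Return (transcript_line, reason) pairs that deserve manual review."""
    issues = []
    for line in transcript:
        duration = line.end - line.start
        # Diarization segment sharing the most time with this transcript line.
        best = max(diarization, key=lambda seg: overlap(line, seg), default=None)
        covered = overlap(line, best) if best else 0.0
        if best is None or duration <= 0 or covered / duration < min_coverage:
            issues.append((line, "no matching segment"))
        elif best.speaker != line.speaker:
            issues.append((line, f"speaker mismatch: {best.speaker}"))
    return issues
```

The returned list is a starting point for the report in task 5: each flagged line still has to be checked by ear against the recording.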

Lab 2 — Transcript vs Recording Consistency

Objective: Identify procedural anomalies.
Tasks:

  1. Load a provided courtroom audio segment.
  2. Generate a Whisper transcription.
  3. Compare it line-by-line with the “official transcript”.
  4. Highlight mismatches related to:
    • omission,
    • misattribution of speakers,
    • altered sequencing.
  5. Discuss the impact on fair-trial guarantees.
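For tasks 3 and 4, a line-level diff is one possible first pass. The sketch below uses Python's standard `difflib` and assumes both transcripts are available as lists of text lines; the function names are hypothetical. It surfaces omissions, insertions, and altered passages, while speaker misattribution and resequencing still need manual review of the flagged spans.

```python
import difflib


def normalize(line: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(line.lower().split())


def compare_transcripts(recorded, official):
    """Diff the recording-based transcript against the official one.

    Returns (tag, recorded_lines, official_lines) tuples where tag is:
      "delete"  - present in the recording, absent from the official text,
      "insert"  - present in the official text, absent from the recording,
      "replace" - both sides have content here, but it differs.
    """
    a = [normalize(x) for x in recorded]
    b = [normalize(x) for x in official]
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    findings = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        findings.append((tag, recorded[i1:i2], official[j1:j2]))
    return findings
```

Because automatic transcription makes its own errors, a "replace" finding is only a pointer to a passage worth listening to, not evidence of an alteration by itself.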

Lab 3 — Chain of Custody Reconstruction

Objective: Understand the digital evidence lifecycle.
Tasks:

  1. Navigate the IIK_70_24 directory structure.
  2. Review metadata in the metadane/ folder.
  3. Build a timeline of evidence creation, copying, and storage.
  4. Identify any “breaks” in the chain.
  5. Submit a formal structured report.
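Tasks 3 and 4 can be prototyped as follows. This sketch assumes each metadata record can be reduced to a timestamp, an action, and a SHA-256 digest of the file at that moment. The actual layout of the metadane/ folder is not described here, so the `CustodyEvent` type, the two break criteria, and the 30-day default are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CustodyEvent:
    timestamp: datetime
    action: str   # e.g. "created", "copied", "stored"
    sha256: str   # digest of the file as recorded at this event


def find_breaks(events, max_gap=timedelta(days=30)):
    """Sort events into a timeline and flag suspicious transitions.

    A transition is flagged when the digest changes between consecutive
    events, or when the time between them exceeds `max_gap`.
    Returns (timeline, breaks), where breaks is a list of (event, reason).
    """
    timeline = sorted(events, key=lambda e: e.timestamp)
    breaks = []
    for previous, current in zip(timeline, timeline[1:]):
        if current.sha256 != previous.sha256:
            breaks.append((current, "hash changed since previous event"))
        elif current.timestamp - previous.timestamp > max_gap:
            breaks.append((current, "undocumented gap"))
    return timeline, breaks
```

A flagged event is a prompt for the report in task 5; whether it amounts to a real break in the chain depends on the surrounding documentation.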

Lab 4 — Technical Building Audit (Interdisciplinary)

Objective: Apply an engineering measurement workflow.
Tasks:

  1. Review photographic and thermographic documentation in joliot-curie-19a.
  2. Analyze heat-loss indicators and insulation patterns.
  3. Correlate sensor readings with structural documentation.
  4. Produce a short engineering-legal assessment.

Lab 5 — Automated Reporting with GitHub Actions

Objective: Learn reproducible research using CI.
Tasks:

  1. Copy the provided workflow.
  2. Configure a daily stats updater.
  3. Add a Python script generating a PDF summary.
  4. Publish results to the repository’s README.
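For task 4, one common approach is to have the scheduled job rewrite a marked section of the README. The sketch below assumes the README contains a pair of HTML comment markers; the marker strings and the function name are hypothetical and would need to match whatever the provided workflow expects.

```python
START = "<!-- stats:start -->"  # hypothetical marker placed in README.md
END = "<!-- stats:end -->"      # hypothetical marker placed in README.md


def update_section(readme: str, body: str, start: str = START, end: str = END) -> str:
    """Replace the text between the markers with `body`, keeping the markers."""
    head, found_start, rest = readme.partition(start)
    if not found_start:
        raise ValueError(f"start marker not found: {start}")
    _, found_end, tail = rest.partition(end)
    if not found_end:
        raise ValueError(f"end marker not found: {end}")
    return f"{head}{start}\n{body.strip()}\n{end}{tail}"
```

The function is idempotent: running it twice with the same body leaves the README unchanged, so a daily job only produces a commit when the generated summary actually differs.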

🤝 Collaboration and Academic Use

I welcome collaboration from:

  • students and researchers in law, digital forensics, and signal processing,
  • engineering and data-science programs,
  • instructors looking for real-world, reproducible case studies.

For academic or research inquiries:
michal@trojaczek.com


“Theory ends where evidence begins. The rest is documentation.”
