Postulate One
Sergey Skrebnev
New SpectrogramDataset package
Released a package that converts folders of audio files into torch datasets of spectrograms, labels and masks, ready for training.
JaVAD: Just Another Voice Activity Detector
Released JaVAD: Just Another Voice Activity Detector, a state-of-the-art open-source VAD package that is faster than every open-source package I tested it against.
JADIA: Dialogue diarization package
JANET (Directional Alignment)-based diarization package, v0.1.1. The new lite model (lite_v2) improves Diarization Error Rate by 20%.
Triplet Loss
Contrastive loss(es)
JADIA-Plot: visualization module for JADIA
Simple way to visualize JADIA results.
pip install jadia-plot
JADIA: Dialogue diarization package
JANET (Directional Alignment)-based diarization package, v0.1. Uses K-means; works best on dialogues between two people.
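To illustrate the K-means idea for two-speaker dialogues, here is a minimal sketch that clusters per-segment embeddings into two groups. It is not the JADIA implementation: the function name `two_speaker_kmeans`, the toy 2-D embeddings, and the deterministic initialization are all assumptions made for this example.

```python
import math

def two_speaker_kmeans(embeddings, iterations=20):
    """Cluster per-segment embeddings into two groups (speaker 0 and speaker 1).

    Hypothetical helper for illustration only, not part of JADIA.
    """
    # Deterministic init (an assumption): the first embedding and the one farthest from it.
    c0 = list(embeddings[0])
    c1 = list(max(embeddings, key=lambda e: math.dist(e, c0)))
    centroids = [c0, c1]
    labels = None
    for _ in range(iterations):
        # Assignment step: each segment goes to the nearest centroid.
        new_labels = [
            0 if math.dist(e, centroids[0]) <= math.dist(e, centroids[1]) else 1
            for e in embeddings
        ]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [e for e, lab in zip(embeddings, new_labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels

# Toy embeddings: three segments near the origin, two near (5, 5).
segments = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [0.05, 0.05]]
speaker_labels = two_speaker_kmeans(segments)
```

With exactly two clusters, every segment is assigned to one of two speakers, which is one plausible reason a K-means approach suits two-person dialogues.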
Cosine distance (and Cosine similarity)
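As a quick reference for the two quantities named in the title, here is a small pure-Python sketch. It uses the common convention that cosine distance is one minus cosine similarity; the function names are chosen for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """1 - cosine similarity, so the result lies in [0, 2]."""
    return 1.0 - cosine_similarity(a, b)
```

Parallel vectors have similarity 1 and distance 0, orthogonal vectors have similarity 0 and distance 1, and opposite vectors have similarity -1 and distance 2.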
TQDM wrapper & custom progress bar
Tired of dealing with TQDM logging issues, I created a TQDM wrapper for Comet ML and W&B. It detects whether the W&B or Comet ML module is active and replaces the TQDM logger with a custom logger that leaves long intervals between updates.
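The detect-and-replace idea could be sketched roughly as follows. This is not the wrapper's actual code: checking `sys.modules` is only one assumed way to detect an experiment tracker, and `tracker_active`, `ThrottledProgress`, and `progress` are hypothetical names invented for this example.

```python
import sys
import time

def tracker_active():
    """True if W&B or Comet ML has been imported in this process (assumed detection method)."""
    return any(name in sys.modules for name in ("wandb", "comet_ml"))

class ThrottledProgress:
    """Iterate over `iterable`, writing a progress line at most every `min_interval` seconds."""

    def __init__(self, iterable, total=None, min_interval=30.0, write=print):
        self.iterable = iterable
        # Fall back to len() when the iterable supports it.
        if total is None and hasattr(iterable, "__len__"):
            total = len(iterable)
        self.total = total
        self.min_interval = min_interval
        self.write = write

    def __iter__(self):
        last = time.monotonic()
        for i, item in enumerate(self.iterable, 1):
            yield item
            now = time.monotonic()
            # Log when enough time has passed, and always on the final item.
            if now - last >= self.min_interval or i == self.total:
                suffix = f"/{self.total}" if self.total is not None else ""
                self.write(f"{i}{suffix}")
                last = now

def progress(iterable, min_interval=30.0):
    """Use the throttled logger when a tracker is loaded, otherwise plain tqdm if available."""
    if not tracker_active():
        try:
            from tqdm import tqdm
            return tqdm(iterable)
        except ImportError:
            pass
    return ThrottledProgress(iterable, min_interval=min_interval)
```

Sparse plain-text updates avoid flooding the tracker's captured console output with the carriage-return redraws of a regular progress bar.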
Hinton's Dynamic Routing between capsules
What are queries, keys and values?
Whisper Fine-tuning script
Simple Whisper trainer, directly compatible with the OpenAI repo (no need for Hugging Face libraries)
How does positional encoding work in transformers?
Why do we need biases?
Why do we need activation functions?
Directional Alignment layer
Repo for a custom Directional Alignment layer, designed to extract features from speech spectrograms better than Conv1d/Conv2d ResNet-like stacks.
Forward-Forward
Custom implementation of Geoffrey Hinton's Forward-Forward algorithm with a rather stupid but working loss.
Tiny custom PDF chat
Custom FastAPI/uvicorn microservice that processes PDFs and extracts information using ChatGPT
Improving Speaker Verification by introducing an Alignment layer
A custom Alignment layer that improves feature extraction capabilities.