AbstractPhil (AbstractPhila)

Open to Collab

AI & ML interests

datasets, research papers, experimentation, vision, classification, text encoders, tokenization, llms, diffusion, distillation, and more.

Recent Activity

published a dataset about 1 hour ago
AbstractPhil/bertenstein-v1
posted an update about 2 hours ago
geolip-bertenstein-v1 - 5 experts chosen. A collective of shared, transformer-aligned experts, not a mixture of experts. Similar to an MoE, but not quite. This first prototype won't have the full mailing projection relay system afforded by the geofractal router, but it will definitely be a solid prototype. The model will be able to respond to text, transcribe with Whisper, see with dinolip, code with CodeBERT, and process proteins using esm2_t33_650m_ur50.

The experts for the prototype are:
- google-bert/bert-large-uncased
- facebook/dinov2-large
- microsoft/codebert-base
- openai/whisper-large-v3
- facebook/esm2_t33_650M_UR50

Not the smartest text model, but more than enough for this preliminary use-case test setup. Text is predominantly meant to align and orient downstream function; the entire machine is meant to be operated unilaterally as a collective, or independently through individual pair requests via special-token access.

Even as a prototype, this model will be capable of substantial feats. It will be able to see and process differential equations utilizing DINOv2 and ESM2 data simultaneously, which can be used for downstream analysis - and I WILL use that data to create a more powerful connection between dinov2 tokens, protein tokens, video tokens, code tokens, and audio tokens. This is the FIRST prototype of its kind, and I will introduce video, genetics, shape analysis, pattern-recognition processing, and a much more powerful and reusable text model.

The tests show the models can have differential communication through the geolip transformers after Procrustes pairwise analysis and pentachoron CV protective measures. Whitening Procrustes for precalculation and center-aligning allows for faster convergence, so that should help too (a minimal sketch of that step follows below). I'll be working to make this prototype smaller and more powerful, rather than larger.
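A minimal sketch of the whitening-then-Procrustes precalculation mentioned above, assuming paired embeddings for the same items have already been extracted from two experts. The shapes, the `whiten` helper, and the random placeholder data are illustrative assumptions, not the actual geolip code:

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

def whiten(X, eps=1e-6):
    """Center embeddings and decorrelate their dimensions (ZCA whitening)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    U, S, _ = np.linalg.svd(cov)
    return Xc @ (U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T)

# Hypothetical paired outputs: row i of each matrix describes the same item,
# e.g. a COCO caption through BERT-Large and its image through DINOv2,
# both already projected to a shared width of 1024.
rng = np.random.default_rng(0)
text_emb = rng.standard_normal((512, 1024))   # placeholder for real features
image_emb = rng.standard_normal((512, 1024))  # placeholder for real features

# Whiten and center first, then solve for the orthogonal rotation R
# minimizing ||A @ R - B||_F (classical orthogonal Procrustes).
A, B = whiten(text_emb), whiten(image_emb)
R, _ = orthogonal_procrustes(A, B)
aligned_text = A @ R  # text embeddings expressed in the image expert's frame
```

Whitening both spaces before solving for R is what makes the orthogonal map well-conditioned; with raw, anisotropic features the closed-form rotation fits the dominant directions first and converges more slowly.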
replied to their post about 2 hours ago
I've... done it. This, with experts, achieves near-100% R@1 retrieval accuracy on an adjacent dataset - unseen by the fusion transformer - after around 40k steps on the seen dataset. This means the languages of the models are genuinely fused within the constraints, not just projected or estimated. https://huggingface.co/AbstractPhil/geolip-procrustes

I encourage EVERYONE who is curious to check my work. Check it, double-check it, and triple-check it. These were aligned using COCO and then validated with Flickr - entirely different datasets. The experts arbitrated, and the alignment yielded the correct answers. Preliminary tests show that with almost no alignment requirement, the models can reach 100% R@1 retrieval accuracy. Not to be confused with validation accuracy for a classification model or a text encoder's text responses: this enables multispectral communication between entirely different models for direct downstream consumption, with almost no training for the chosen models.

I have a working Procrustes experiment that learns adjacent manifolds within a reasonable spectrum, and the speed is... well, one epoch on COCO with BERT-Large and DINOv2 is enough for the models to align nearly perfectly. For some scales the experiment shows that the three configured epochs aren't quite enough to push R@1 to its maximum, while many scales align almost immediately. These two models were an obvious pair to pick: 60% similarity and >90% spectral similarity. The trainer transfers layers, learns embeddings, and more - all by sticking strictly to geometric boundaries and Procrustes informational accumulation within a modulation model's constraints. I have many experiments to run.
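For reference, a hedged sketch of the R@1 check described above: after alignment, each caption embedding should retrieve its own image as the top cosine-similarity match. The function name and the reuse of `aligned_text` and `B` from the previous sketch are assumptions for illustration, not the author's evaluation code:

```python
import numpy as np

def recall_at_1(queries: np.ndarray, gallery: np.ndarray) -> float:
    """Fraction of queries whose nearest gallery row (cosine) is the true pair."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    nearest = (q @ g.T).argmax(axis=1)  # best-matching gallery index per query
    return float((nearest == np.arange(len(q))).mean())

# With aligned_text and B from the Procrustes sketch above (row i = same item):
# print(f"R@1: {recall_at_1(aligned_text, B):.3f}")  # near 1.0 matches the post
```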

Organizations

DeepGHS · Blog-explorers · BangumiBase · Abstract Powered Research