mrseeker

New dataset: "Erebus"

Added 2022-08-31 12:19:19 +0000 UTC

I have finally done it: I am currently cleaning and compiling a new dataset called "Erebus".

Some specifications:
Size: 4 GB (200k stories), before cleaning
Contains:
- Dataset G (Pixiv)
- Literotica (4.5 or higher)
- Sexstories (90% or higher)
- Pike (selected on "Adult" stories)
- Doc's Lab (90% or higher)
- SoFurry (mixture of various tags)

The dataset needs to be pruned of all short stories (<10 KB), will be cleaned up using the same settings I used for Nerys-v2, and will most likely be made in multiple variations (2.7B, 6B, 13B, 30B). I am debating whether I should use Nerys-v2 as the base or just the base model.
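For illustration, the size-pruning step could look something like the sketch below. It is not the actual cleaning pipeline. It assumes each story is a separate `.txt` file in one directory and that "10kb" means 10 KiB on disk. The names `split_by_size` and `MIN_BYTES` are made up for this example.

```python
from pathlib import Path

# Assumption: "10kb" is read as 10 KiB of file size on disk.
MIN_BYTES = 10 * 1024


def split_by_size(story_dir, min_bytes=MIN_BYTES):
    """Split the .txt files in story_dir into (kept, dropped) by file size.

    Files of at least min_bytes are kept; anything smaller is dropped.
    Nothing is deleted here, the function only reports the split.
    """
    kept, dropped = [], []
    for path in sorted(Path(story_dir).glob("*.txt")):
        if path.stat().st_size >= min_bytes:
            kept.append(path)
        else:
            dropped.append(path)
    return kept, dropped
```

Returning both lists instead of deleting files makes it easy to check what would be removed before committing to the threshold.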

The 30B version will most likely be privately hosted for those that would like to spread the word using the KoboldAI-cluster. Please note that since I am eating the cost of running the server, it will most likely be available for patrons and supporters, and I won't be running it 24/7.

On a related matter, I am currently having issues with Runpod being hoarded by GPU miners. Although I am partly responsible for this (sorry), I have to say that the income I get from it does pay most of the bills. I hope that shortly it will even be able to pay for a KoboldAI instance in full. But to get to that point, I need to do some legwork.

Comments

Hello Mr. Seeker, I tried out the Erebus model and I think it's quite good. I was not really impressed with the Shinen 13B model, so I was still using Shinen 6B when I wanted to do NSFW stuff. As for the Shinen 13B model, I found it not really responsive. It would follow what I input quite well but was rather bland if you just tried to let it tell a story, even at high temperature. So far, though, I am very impressed with Erebus. I am finding that it starts to loop and repeat very quickly, although this could be due to my own prompts and experimenting with the settings. Thank you for this hard work!

Tony V

2022-10-02 17:58:54 +0000 UTC

That's a great idea, but I think I need to know what the cut-off metric should be (https://pypi.org/project/py-readability-metrics/).

Julius

2022-08-31 13:05:53 +0000 UTC

What do you think about further limiting the dataset based on something like Flesch-Kincaid Grade Level? There are a lot of texts out there that are considered "good" rating-wise but are very simplistic, which could lead to simplistic output for those of us that can't write well. Something like textstat in Python could give a lot of grade-level stats.

ebolam

2022-08-31 12:50:37 +0000 UTC
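As a rough illustration of the grade-level filtering idea discussed above, here is a minimal sketch using only the standard library. It applies the standard Flesch-Kincaid Grade Level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, with a crude vowel-group syllable estimate. A dedicated package such as textstat or py-readability-metrics would likely count more carefully. The function names are made up for this example, and `min_grade` is deliberately left as a parameter because the thread does not settle on a cut-off.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Heuristic: a trailing "e" is usually silent ("made"), except in "-le" ("table").
    if count > 1 and lowered.endswith("e") and not lowered.endswith("le"):
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level of text, or 0.0 if there is nothing to score."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def filter_by_grade(stories, min_grade):
    """Keep only stories whose estimated grade level is at least min_grade."""
    return [s for s in stories if flesch_kincaid_grade(s) >= min_grade]
```

Before picking a cut-off, it would probably help to score a sample of the dataset and look at the distribution, since the syllable heuristic shifts the absolute numbers somewhat.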
