LLMs in JupyterHub

You can easily experiment with LLMs in JupyterHub: use the managed instance we provide, or run your own model.

Make sure your /home/jovyan volume is large enough to hold the model weights (often hundreds of GB), and ask the admins to extend it if necessary.
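To see how much room is left on the volume before downloading weights, a quick standard-library check can help (the 100 GB threshold below is an illustrative assumption, not a hard rule):

```python
import shutil

def free_gb(path="."):
    """Return free space on the volume holding `path`, in GB."""
    usage = shutil.disk_usage(path)
    return usage.free / 1e9

# Run this from a notebook whose working directory is on /home/jovyan
if free_gb(".") < 100:
    print("Volume may be too small for large model weights; ask admins to extend it.")
```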

Start a Jupyter pod with enough memory and CPU cores for your model, and with an appropriate GPU type.
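To confirm what resources the pod actually received, a small standard-library sketch (note: inside a container, /proc/meminfo and os.cpu_count() may report the node's totals rather than the pod's cgroup limits, so treat the numbers as an upper bound):

```python
import os

def pod_resources():
    """Report CPU cores and total memory visible to this pod."""
    cores = os.cpu_count()
    mem_gb = None
    # /proc/meminfo is Linux-specific, which matches a JupyterHub pod
    try:
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    mem_gb = int(line.split()[1]) / 1e6  # kB -> GB
                    break
    except OSError:
        pass
    return cores, mem_gb

cores, mem_gb = pod_resources()
print(f"CPU cores: {cores}, memory: {mem_gb and round(mem_gb, 1)} GB")
```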

Install the Hugging Face libraries:

!pip install --user --upgrade diffusers accelerate transformers

Then run Stable Diffusion in Python to generate an image:

from diffusers import StableDiffusionPipeline
import torch

model_id = "runwayml/stable-diffusion-v1-5"
# Load the pipeline in half precision and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "An astronaut riding a horse, painting in Dali style"
image = pipe(prompt).images[0]
image.save("astronaut.png")


Or generate text:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
# device_map="auto" (via accelerate) places the model on the available GPU(s)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype="auto", device_map="auto"
)

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate up to 30 tokens and decode them back to text
generate_ids = model.generate(inputs.input_ids, max_length=30)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])

Each model's page on Hugging Face includes documentation on how to use it.

The model files are cached under /home/jovyan/.cache/huggingface.
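If the home volume fills up, the cache can be relocated with the HF_HOME environment variable (set it before importing the Hugging Face libraries). A minimal sketch of how the location resolves; the scratch path below is an illustrative assumption:

```python
import os

def hf_cache_dir():
    """Resolve the Hugging Face cache base directory the way the
    libraries do: HF_HOME overrides the ~/.cache/huggingface default."""
    default = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    return os.environ.get("HF_HOME", default)

# Point the cache at a different volume before loading any model
os.environ["HF_HOME"] = "/home/jovyan/scratch/huggingface"  # illustrative path
print(hf_cache_dir())
```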