Lab 11 - Definining Virtual Services for LLM Providers

Gloo Platform EKS Workshop > Gloo Gateway (Bonus Section) > Lab 11 - Definining Virtual Services for LLM Providers

Lab 11 - Definining Virtual Services for LLM Providers

Virtual Services for LLM Providers - Lab 11

In this lab, a Gloo Gateway Virtual service that contains specifications for Open AI, Mistral and Hugging Face AI endpoints will be created, and a corresponding API Token for Hugging Face will be created as a Gloo Secret. The other services require more complex methods to obtain tokens. If you have access to the tokens yourself, you can implement them similarly to the Hugging Face implementation demonstrated in this lab. The setup will be validated using the curl command.

Define the upstream service for Hugging Face APIs:

kubectl apply -f - <<EOF
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  labels:
    app: gloo
  name: huggingface-upstream
  namespace: default
spec:
  static:
    useTls: true
    autoSniRewrite: true
    hosts:
    - addr: api-inference.huggingface.co
      port: 443
---
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  labels:
    app: gloo
  name: openai
  namespace: default
spec:
  static:
    useTls: true
    autoSniRewrite: true
    hosts:
    - addr: api.openai.com
      port: 443
---
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  labels:
    app: gloo
  name: mistralai
  namespace: default
spec:
  static:
    useTls: true
    autoSniRewrite: true
    hosts:
    - addr: api.mistral.ai
      port: 443
EOF

The list of upstream defined services can be listed using CLI:

glooctl get upstream -n default

Create a secret to store the Hugging Face issued token: You can get it online via the service website free of charge or the instructor will share one with you.
```
export HF_API_TOKEN=<Hugging Face Token>
glooctl create secret header hftoken --headers "Authorization"="Bearer $HF_API_TOKEN" -n default
```

Create a Virtual Service that defines listening URI, destination, and adds the token to the request:

kubectl apply -f - <<EOF
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: ai-services-vs
  namespace: default
spec:
  virtualHost:
    domains:
      - "*"
    routes:
      - matchers:
          - prefix: /huggingface-gloo-gw
        routeAction:
          single:
            upstream:
              name: huggingface-upstream
              namespace: default
        options:
          autoHostRewrite: true
          prefixRewrite: "/models/openai-community/gpt2"
          headerManipulation:
            requestHeadersToAdd:
              - header:
                  key: Content-Type
                  value: application/json
                append: false
              - headerSecretRef:
                  name: hftoken
                  namespace: default
      - matchers:
         - prefix: /openai
        routeAction:
          single:
            upstream:
              name: openai
              namespace: default
        options:
          autoHostRewrite: true
          prefixRewrite: '/v1/chat/completions'
          headerManipulation:
            requestHeadersToAdd:
            - header:
                key: Content-Type
                value: application/json
              append: false
            - headerSecretRef:
                name: openai-secret
                namespace: default
      - matchers:
         - prefix: /mistralai
        routeAction:
          single:
            upstream:
              name: mistralai
              namespace: default
        options:
          autoHostRewrite: true
          prefixRewrite: '/v1/chat/completions'
          headerManipulation:
            requestHeadersToAdd:
            - header:
                key: Content-Type
                value: application/json
              append: false
            - header:
                key: Accept
                value: application/json
              append: false
            - headerSecretRef:
                name: mistralai-secret
                namespace: default
EOF

Test the setup:

export GLOO_GATEWAY=$(kubectl -n gloo-system get svc gateway-proxy -o jsonpath='{.status.loadBalancer.ingress[0].*}')
curl -X POST http://$GLOO_GATEWAY/huggingface-gloo-gw \
    -d '{"inputs": "What are the advantages of running Gloo Gateway in AWS?"}'

In this lab, we configured Upstream definitions for three different AI public services and created a single Virtual Service that directs customer requests according to the URI path. Since the Hugging Face service token is available to us at no charge, it has been stored as a secret and automatically added to the user request when it traverses Gloo Gateway. In the next lab, we will demonstrate how to limit access to the service based on provided JWT token information.