* feature(intelligent-search): Added API to connect to Llama.cpp on EC2 and filter the response into OR filters
* Updated the SQL-to-filter script and added init.sql for the tables
* feature(intelligent-search): Replaced llama.cpp with Llama running on GPU, now contained in the API
* Updated Dockerfile to use the GPU and download the LLM from S3
* Added link to facebook/research/llama
* Updated Dockerfile
* Updated requirements and Dockerfile base images
* Fixed minor issues: removed unused variables, updated COPY and replaced values
* fix(intelligent-search): Fixed WHERE statement filter
* feature(smart-charts): Added method to create charts using Llama
* style(intelligent-search): Changed attribute names to match the frontend format
* fix(intelligent-search): Fixed vulnerability in requirements and other small fixes
* Added some tests before deploying the service
* Added semaphore to handle concurrency

---------

Co-authored-by: EC2 Default User <ec2-user@ip-10-0-2-226.eu-central-1.compute.internal>
28 lines
586 B
Docker
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

COPY requirements.txt .
RUN pip install -r requirements.txt

WORKDIR /api

COPY llama/llama/*.py llama/
COPY auth/*.py auth/
COPY crons/*.py crons/
COPY utils/*.py utils/
COPY core/*.py core/
COPY *.sh ./
COPY *.py ./

# CHECKPOINT_DIR and TOKENIZER_PATH point at the weights the entrypoint
# downloads from S3; the empty S3_*/AWS_* values are injected at run time.
ENV \
    RANK=0 \
    WORLD_SIZE=1 \
    LOCAL_RANK=0 \
    MASTER_PORT=29500 \
    MASTER_ADDR=localhost \
    CHECKPOINT_DIR=/api/llama-2-7b-chat/ \
    TOKENIZER_PATH=/api/tokenizer.model \
    S3_LLM_DIR= \
    S3_TOKENIZER_PATH= \
    AWS_ACCESS_KEY_ID= \
    AWS_SECRET_ACCESS_KEY= \
    LLAMA_API_AUTH_KEY=

EXPOSE 8082

# Exec form so the script receives signals (e.g. docker stop) directly.
ENTRYPOINT ["./entrypoint.sh"]
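The commit message at the top mentions filtering the model's response into OR filters and a fix to the WHERE statement filter. A minimal sketch of how such a clause might be built, assuming the LLM output has already been parsed into a column-to-values mapping; the function name and the `%s` DB-API paramstyle are assumptions, not taken from the repository:

```python
def build_where_clause(filters: dict) -> tuple:
    """Build a parameterised WHERE clause: values for the same column are
    OR-ed together, and the per-column groups are AND-ed.

    Column names must come from a trusted whitelist, since they are
    interpolated into the SQL; the values go through placeholders.
    Returns the SQL fragment and the parameter list for a DB-API cursor.
    """
    groups, params = [], []
    for column, values in filters.items():
        if not values:
            continue
        placeholders = " OR ".join(f"{column} = %s" for _ in values)
        groups.append(f"({placeholders})")
        params.extend(values)
    where = f"WHERE {' AND '.join(groups)}" if groups else ""
    return where, params

# Example: filters the model might have extracted from a user query.
clause, params = build_where_clause({"country": ["DE", "FR"], "year": [2023]})
# clause -> "WHERE (country = %s OR country = %s) AND (year = %s)"
```

Keeping the values as parameters rather than string-formatting them into the clause is what closes the injection hole a naive WHERE builder would open.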