So, I was reading the privacy notice and the terms of use, and I saw some sketchy stuff in there (data used for advertising, keystroke logging). How bad is it? Is it like ChatGPT or worse? Anything I can do about it?

      • voracitude@lemmy.world
        3 days ago

        What GPU do you have? I got the 8B model running on a 2070 Super and it’s pretty damn fast. I haven’t tried the 7B model but I can say the 1B isn’t useful for Q&A at least (might have some application in dev pipelines, like maybe in an app for automatically categorising music or files or whatever).

        In any case, if you sign up to use it on a service, expect that it’s going to be the most censored model, and also that it’s hoovering up all the data you’re giving it and probably quite a lot that you’re not, to give to the Chinese government.

          • voracitude@lemmy.world
            2 days ago

            It looks like that has 4GB of VRAM, so depending on the rest of your system you might actually be able to run some quantised models! You'll need to use software that supports "offloading" the operations that don't fit in your GPU's VRAM to system RAM, like LM Studio (https://lmstudio.ai/).
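
            If you'd rather script it than click around a GUI, llama.cpp's Python bindings expose the same offloading idea. Very rough sketch - the filename and layer count below are placeholders, tune them for whatever GGUF you end up with:

            ```python
            # Rough sketch using llama-cpp-python (pip install llama-cpp-python).
            # n_gpu_layers controls how many transformer layers get offloaded to VRAM;
            # whatever doesn't fit stays in system RAM.
            from llama_cpp import Llama

            llm = Llama(
                model_path="DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf",  # placeholder filename
                n_gpu_layers=20,  # start low on a 4GB card, raise it until you run out of VRAM
                n_ctx=4096,       # context window; bigger costs more memory
            )

            out = llm("Explain GGUF quantisation in one sentence.", max_tokens=128)
            print(out["choices"][0]["text"])
            ```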

            I recommend checking out Unsloth’s models, they specifically try to fine-tune for use on older/slower hardware like yours. Their HuggingFace is here: https://huggingface.co/unsloth

            This is the version of Deepseek you’ll wanna try: https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF

            Click one of the options on the right hand side to download the model file, or grab it from the command line as in the sketch below.
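
            Something like this should work for the command-line route (the exact filename is a guess - check the repo's file list for the real one):

            ```python
            # Sketch: pulling one quantised file from the Unsloth repo with huggingface_hub
            # (pip install huggingface_hub). The filename is an example - pick the actual
            # Q2/Q4 file name from the repo's file list.
            from huggingface_hub import hf_hub_download

            path = hf_hub_download(
                repo_id="unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF",
                filename="DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf",  # example name, verify on the repo
            )
            print(path)  # local path you can point LM Studio / llama.cpp at
            ```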

            Very basically speaking, the lower the “bits”, the smaller the file (and the dumber the model), and therefore the less VRAM and system RAM you’ll need to run it. If you get one of the 2-bit versions, you might be able to fit the whole thing inside your GPU - the 2-bit models are only ~3.2GB! You can probably run 4-bit though, even on your hardware.
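
            If you want to sanity-check those numbers, the back-of-the-envelope maths is just parameters × bits ÷ 8; real GGUF files come out a bit bigger because some layers stay at higher precision. Quick sketch:

            ```python
            # Back-of-the-envelope size estimate for an 8B-parameter model at different
            # quantisation levels. Real GGUF files run somewhat larger because embeddings
            # and some layers are kept above the nominal bit width.
            params = 8e9

            for bits in (2, 4, 8, 16):
                gb = params * bits / 8 / 1e9  # bytes -> decimal GB
                print(f"{bits}-bit: ~{gb:.1f} GB")

            # 2-bit: ~2.0 GB (the real ~3.2GB file includes that higher-precision overhead)
            # 4-bit: ~4.0 GB (why partial offload to system RAM helps on a 4GB card)
            ```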

            • ByteMe@lemmy.worldOP
              2 days ago

              Wow, that's a thorough explanation. Thanks! I also have 16 gigs of RAM and a 6th-gen i7.

              • voracitude@lemmy.world
                2 days ago

                No problem - and that's not thorough, that's the cut-down version haha!

                Yeah, that hardware’s a little old so the token generation might be slow-ish (your RAM speed will make a big difference, so make sure you have the fastest RAM the system will support), but you should be able to run smaller models without issue 😊 Glad to help, I hope you manage to get something up and running!