By the time you’re reading this, the landscape of Large Language Models (LLMs) has likely evolved further, but one thing remains constant: their usage in software development continues to ramp up, offering new ways to improve productivity and transform workflows.
Enhancing Software Development with LLMs
If you write software for a living, you’ve probably encountered countless articles sharing how LLMs can enhance your productivity and streamline your development workflows. Tools like Cursor (my personal favorite), GitHub Copilot, and the newly announced OpenAI Canvas are tailored specifically for software development and excel at code generation, error detection, and automating repetitive tasks. If you have access to any of these tools at work, consider yourself lucky. If not, keep reading to explore some alternatives.
Privacy Concerns in Regulated Industries
While commercial LLM solutions have privacy policies and “private mode” options, significant concerns remain, limiting their adoption in regulated industries like Finance and Healthcare. Here are a few of the main issues:
- Data Retention: Many commercial tools store user input and use it to improve their models and troubleshoot issues, which directly conflicts with the strict data retention and handling rules that regulated industries must follow.
- Data Exposure: When proprietary code or sensitive data is processed on external servers, there is always a risk of exposing trade secrets or your customers' personally identifiable information (PII).
- Lack of Transparency: There is often limited disclosure about how data is handled and secured, making it difficult for organizations to verify compliance with industry standards.
Local LLMs as an Alternative
One alternative is to run your own LLM models locally, if your company policies allow it. The open-source community has created some great tools for this purpose. I have mainly used Ollama, with Open WebUI as its interface, to manage and run local models. For an IDE-embedded experience similar to Cursor or Copilot, Continue.dev is a great 1:1 alternative that you can configure to use your own models served by Ollama; a minimal sketch of talking to a local Ollama instance follows below. After that, there are a few considerations to keep in mind when exploring this route.
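This quick sanity check uses nothing but Python and the requests library. It assumes Ollama is running on its default port (11434) and that you have already pulled at least one model; it lists what is installed and sends a one-shot prompt to the first model it finds.

```python
# minimal_ollama_check.py -- sanity-check a local Ollama instance.
# Assumes Ollama is running on its default port (11434) and that at
# least one model has been pulled (e.g. `ollama pull llama3.1`).
import requests

OLLAMA_URL = "http://localhost:11434"

# List the models currently installed locally.
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
installed = [m["name"] for m in tags.get("models", [])]
print("Installed models:", installed)

# Send a one-shot, non-streaming prompt to the first installed model.
if installed:
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={
            "model": installed[0],
            "prompt": "Write a one-line docstring for a function that reverses a string.",
            "stream": False,
        },
        timeout=120,
    )
    print(resp.json()["response"])
```

If this works, Open WebUI and Continue.dev are simply nicer front ends on top of the same local API.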
Hardware Requirements
Running local LLMs is resource-intensive, especially when it comes to memory usage, and they will compete for resources with whatever local dev stack you are running.
I have tested local LLMs on different M-series MacBooks. With 16GB of RAM you can run small models (~3B parameters) decently, but the experience and capabilities will be limited. I found that the sweet spot for running medium models (8-12B) is at least 32GB of RAM (I went with 64GB to have some more breathing room).
LLM (ML) Server: If you have the budget and the knowledge to build your own ML server with a few GPUs, it is no more complex than running a containerized application that you can then connect to over your local network, fully offloading the LLM inference workload. This is on my future projects list.
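As a sketch of what that offloading looks like from the client side: the Ollama API is the same whether it runs on your laptop or on a GPU box elsewhere on your network, so the only change is the base URL. The hostname below (ml-server.local), the model name, and the OLLAMA_HOST variable are placeholders for whatever your own setup exposes.

```python
# Point the same client code at a self-hosted inference server instead of
# localhost. "ml-server.local" is a hypothetical hostname; set OLLAMA_HOST
# to wherever your containerized Ollama (or compatible server) is exposed.
import os
import requests

OLLAMA_URL = os.environ.get("OLLAMA_HOST", "http://ml-server.local:11434")

resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "llama3.1", "prompt": "Say hello from the GPU server.", "stream": False},
    timeout=120,
)
print(resp.json()["response"])
```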
Available Models
While locally run models may not be as sophisticated as commercial solutions, many have been fine-tuned for programming languages and tasks such as code completion, generating boilerplate code, and bug fixing. Personally, I’ve found Codestral and deepseek-coder useful for my usual coding tasks. To explore these and other models, check out the Ollama model library or join the r/LocalLLaMA subreddit for updates and discussions with other local LLM enthusiasts.
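As a small example of the kind of task these code-tuned models handle well, the sketch below asks a locally pulled Codestral (swap in deepseek-coder or any other installed model) to review a buggy function through Ollama's chat endpoint. It assumes the model has already been pulled, e.g. with `ollama pull codestral`.

```python
# Ask a locally running, code-tuned model to spot the bug in a small function.
# Assumes `ollama pull codestral` (or another code model) has been run beforehand.
import requests

OLLAMA_URL = "http://localhost:11434"

buggy_snippet = '''
def average(values):
    total = 0
    for v in values:
        total += v
    return total / len(values)  # crashes on an empty list
'''

resp = requests.post(
    f"{OLLAMA_URL}/api/chat",
    json={
        "model": "codestral",
        "messages": [
            {"role": "system", "content": "You are a concise code reviewer."},
            {"role": "user", "content": f"Find and fix the bug in this function:\n{buggy_snippet}"},
        ],
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["message"]["content"])
```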
Final Thoughts
The role of LLMs in software development continues to expand, bringing great potential for productivity improvements. Navigating privacy and security concerns, especially in regulated sectors, is crucial. Whether you choose a commercial solution or decide to manage your own local models, the key is finding a balance that aligns with your specific needs and constraints.
If you’re experimenting with local LLMs or have some cool ways you’ve added them into your dev process, I’d love to hear about it! Drop a comment.
Thanks,
David