Databricks has launched a new extension for the Databricks Lakehouse Platform: the VS Code Extension for Databricks.
This new extension enables developers to write code locally using the authoring capabilities of the IDE, connect to Databricks clusters and run code remotely, and apply software development best practices such as source code control, unit testing and CI/CD directly from their favourite IDE.
“This will be the first of many planned releases and updates for teams who rely on IDEs for their development process,” said Patrick Wendell, co-founder and VP Engineering at Databricks. “We’ve invested in a new team focusing exclusively on the breadth of the developer ecosystem, and will be rolling out support for other IDEs and additional tools that enable full access to the lakehouse from third-party products.”
Develop natively inside VS Code
Organisations can now build all of their data and AI applications while staying inside their IDE. Developers can author the code for their pipelines and jobs in VS Code, then deploy, test and run it in real time on their Databricks cluster.
This will enable teams to apply software development best practices and utilise VS Code’s native capabilities for editing, refactoring, testing, and CI/CD for data and AI projects.
“My team wants to run their data workloads on the Databricks Lakehouse but develop their data apps from their IDE. Now they can write code, edit, and test in their IDE and run their code on Databricks.” – Andrew Garrido, Head of Software Development for Data Processing at Kantar
The full power of the Databricks Lakehouse in your IDE
Now that organisations can build on Databricks within VS Code, developers can perform all of their work in one location. Databricks objects can be managed inside VS Code natively with the new extension, allowing teams to stay in their IDEs and preventing context switching between applications. All of the Databricks components, such as clusters, pipelines, and tasks, are integrated into VS Code workspaces and regular workflows. The Lakehouse’s scale can be utilised to process and analyse large data sets, use clusters for queries and visualisations, train machine learning models, and deploy jobs to production so that anyone in an organisation can see and use data to make decisions, all within VS Code.
Uniquely designed to take advantage of full IDE capabilities
Developer teams can enjoy all the comforts of VS Code they are used to while building applications on Databricks. Navigating to function definitions, refactoring, using advanced find-and-replace, and utilising split windows can help drive team-wide efficiency. Teams can also receive code completion on functions and variables, including Databricks-specific objects, to speed up discovery and development.
With VS Code and Databricks, developers can apply the software development best practices they already have in place, building and modularising their programs across separate functions and files. Because files are local, teams can use VS Code’s Git tools and the git CLI, and can package code into files and libraries to encourage reuse and improve code hygiene. Developers can use their preferred testing framework to ensure software quality and integrate with leading CI/CD tools to deliver code into production faster.
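As a rough illustration of the modular, locally testable structure described above, the sketch below splits a small transformation into plain Python functions and checks them with assert-based tests. The function names (clean_records, total_by_country) and the sample data are hypothetical and are not part of the extension or of any Databricks API; a real project would typically keep the functions and tests in separate files and run them with whichever testing framework the team prefers, such as pytest.

```python
# Hypothetical example: small, pure functions that can be unit tested
# locally before the code is run on a remote cluster.

def clean_records(records):
    """Drop rows with a missing 'amount' and normalise country codes."""
    cleaned = []
    for row in records:
        if row.get("amount") is None:
            continue  # skip incomplete rows
        cleaned.append({**row, "country": row["country"].strip().upper()})
    return cleaned


def total_by_country(records):
    """Sum the 'amount' field for each country code."""
    totals = {}
    for row in records:
        totals[row["country"]] = totals.get(row["country"], 0) + row["amount"]
    return totals


# Assert-based tests in the style a runner such as pytest would collect.
def test_clean_records_drops_missing_amounts():
    rows = [{"country": " gb ", "amount": 10}, {"country": "US", "amount": None}]
    assert clean_records(rows) == [{"country": "GB", "amount": 10}]


def test_total_by_country_sums_amounts():
    rows = [{"country": "GB", "amount": 10}, {"country": "GB", "amount": 5}]
    assert total_by_country(rows) == {"GB": 15}
```

Keeping the logic in plain functions like these means the same code can be imported by a job or pipeline and exercised by a CI run without needing a cluster.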
“I find that the Databricks extension enables developers like me, who prefer an IDE, to be able to readily harness the power of VS Code with native and extensible functions while easily allowing me to run Databricks locally. Very helpful!” – Sam Walker, Data Engineer at Watco
A fully supported IDE experience
The new VS Code extension is available directly through the Visual Studio Marketplace, which streamlines installation and ensures that teams get an officially supported and trusted version from Databricks. Databricks will update the extension regularly with experience and quality improvements, and CIOs can be assured that their investment is future-proof, with support added for new capabilities of the Databricks Lakehouse Platform as they are released.
“We consider it a huge advance that will further boost our development process, a milestone that certainly deserves to be celebrated and shared!” – Iago Brandão, MLOps at ViaHub