Cloud Technology – Rōnin Consulting
https://www.ronin.consulting
Expert Engineers Delivering Superior Software

An Essential Guide to AI Hosting: Finding the Right Platform
https://www.ronin.consulting/cloud-technology/guide-to-ai-hosting/
Tue, 13 Aug 2024 21:28:47 +0000

Running AI models requires substantial hardware resources, often beyond the capacity of standard servers or virtual machines. To address this, enterprise software can leverage AI models hosted in the cloud or on specialized on-premises machines.

Over the past year, AI hosting has grown tremendously, not only in model capabilities and pricing, but also in how cloud providers are baking agentic features, governance, and observability directly into their platforms.

Many of our clients seek to incorporate AI into their businesses, and we’ve been approached multiple times for guidance on selecting the best hosting model, starting with whether private or public AI is the right fit before they even get to hosting.

Drawing on our years of experience in the cloud, we are well-equipped to help clients choose the right AI hosting platform. Whether it’s scalability, cost-effectiveness, or specialized features, we’ve successfully guided clients in selecting the hosting platform that best aligns with their business goals.

To help address common questions, we created this primer. Below is a brief overview of some of the more popular cloud and on-premises AI hosting platforms we work with:

Popular Cloud AI Hosting Platforms


Azure AI Foundry

Azure AI Foundry (formerly Azure AI Studio) is Microsoft’s unified platform for building, deploying, and governing AI solutions. Beyond hosting models, Foundry now supports Agent Factory with governance/observability dashboards, and orchestration tools that let enterprises deploy Agentic AI at scale.

  • Overview: Azure AI Foundry supports the deployment of a wide range of models from a model catalog. It offers a playground to test prompts, fine-tuning support, content filters (violence/hate/etc.), and Prompt Flow (a Logic Apps-like builder that supports chaining prompts, logic, and other tools, with execution tracing).
  • Models: Almost 2,000 commercial and open-source models are available, including the GPT-5 family (gpt-5, gpt-5-mini, gpt-5-nano) and Sora for video generation.
  • Deployment: Serverless pay-as-you-go (PAYG) and Managed Compute are offered, along with Model Router (preview), which automatically selects the right model per use case.
  • Pricing: Token-based for serverless and per hour for Managed Compute.
  • RAG: Supported through Azure AI Search.
  • Data Privacy: Customer data is not available to other customers or to OpenAI, and is not used to train/improve any Microsoft or third-party products or services. A BAA is offered for HIPAA compliance.

Azure OpenAI Service

Azure OpenAI Service is a cloud-based platform that brings the power of OpenAI’s advanced language models to Microsoft Azure’s secure and scalable infrastructure. This service enables developers and businesses to integrate AI capabilities, such as natural language processing and conversational AI, into their applications. It offers fine-grained content filter controls and region-locked deployments for compliance, and it benefits from Azure AI Foundry’s orchestration features.

  • Overview: API service accessible in Azure. Offers a playground to test prompts and content filters (violence/hate/etc.).
  • Models: Several GPT flavors with varying context sizes, model sizes, and prices. 
  • Deployment: Can run globally (requests routed to whatever region has capacity, higher throughput limits, latency may vary) or locked to a specific region. 
  • Pricing: Token-based pricing. Pay-as-you-go (PAYG) and Provisioned Throughput Units (PTU) are offered (PTU only if you have a Microsoft account team).
  • RAG: Supported through Azure AI Search.
  • Data Privacy: Customer data is not available to other customers or to OpenAI, and is not used to train/improve any Microsoft or third-party products or services. Offers a BAA for HIPAA compliance.

AWS Bedrock

AWS Bedrock is Amazon’s fully managed generative AI service that gives businesses access to a broad library of foundation models and tools without requiring them to manage any infrastructure. It allows organizations to build, scale, and govern AI applications within the secure and flexible AWS environment.

  • Overview: AWS Bedrock provides immediate access to popular foundation models without custom deployment. It includes a playground for prompt testing, support for fine-tuning and customization, Prompt Flows (a visual builder for chaining prompts, conditionals, API calls, and inline code), Agents for orchestration, and robust governance with auditing. Guardrails now allow policy preview (detect mode), granular enforcement on inputs/outputs, and sensitive information masking. Bedrock is also integrated with SageMaker Unified Studio for unified development, governance, and auditability.
  • Models: The catalog hosts 50+ models, including Amazon’s own Nova family, Anthropic’s Claude 4 and Claude Sonnet 4, and DeepSeek-V3.1 with enhanced reasoning. It also supports open-weight models such as gpt-oss-120B and gpt-oss-20B, along with imports of custom models (Mistral, Flan, LLaMA). Models are continuously updated, with lifecycle and deprecation policies requiring migration to newer versions.
  • Deployment: Models are already running.
  • Pricing: Token-based pricing.
  • RAG: Supported through Knowledge Bases.
  • Data Privacy:  Customer data is not available to other customers and not used to train/improve any products or services. It also offers a BAA for HIPAA compliance.
  • Other: Supports a feature called Prompt Flows, which allows users to visually design the chaining together of various prompts to models, conditionals, and other tools. It supports guardrails, which are content filters for violence, hate, explicit, etc. It supports a playground for testing prompts and offers a full auditing of calls.

GCP Vertex AI

GCP Vertex AI is Google Cloud’s platform for developing, deploying, and managing machine learning models. It provides a unified interface for building custom models, automating workflows, and leveraging pre-trained models. Vertex AI integrates seamlessly with other Google Cloud services, offering tools for scalable and efficient AI solutions.

  • Overview: Vertex AI offers several popular models, some of which are already running and ready for access without any special deployment needed. It also offers a playground to test prompts and fine-tuning support. With their Gemini and PaLM models, this hosting platform provides content filters (violence/hate/etc.) and grounding (connecting model output to verifiable sources of information to prevent hallucinations).
  • Models: 80-90 of the most popular models.
  • Deployment: Some models are already running (managed APIs), and some must be deployed to a specific machine size.
  • Pricing: Token-based pricing for managed API models, and per-hour pricing for models you deploy.
  • RAG: They provide a reference architecture for you to build it using their document AI technology.
  • Data Privacy:  Customer data is not available to other customers and not used to train/improve any products or services. Offers a BAA for HIPAA compliance.
  • Other: It supports direct Google Colab Enterprise integration, a playground for testing prompts, full call auditing, content filters for Gemini and PaLM models, and Apache Airflow via several operators. Vertex AI continues to expand model availability (Gemini 1.5 family, PaLM 3). Grounding features are more robust, helping enterprises connect outputs to verifiable data.

Hugging Face Enterprise

Hugging Face Enterprise is a cloud-based platform offering advanced tools for deploying and managing state-of-the-art machine learning models. It provides access to a wide range of pre-trained models, including those for natural language processing and computer vision, with tons of support for customization and fine-tuning.

  • Overview: Hugging Face Enterprise offers many open-source and commercial models and fine-tunings, some of which have a playground to test with.
  • Models:  It has over 800,000 base and fine-tuned models.
  • Deployment: The models must be deployed, but you can choose Hugging Face’s AWS, Azure, or GCP instance.
  • Pricing: Pricing is per hour.
  • RAG: Not built in. Instead, RAG models can be deployed, and Python code must be written/run on your environment to invoke them and wire them together with other models.
  • Data Privacy: Customer data is not available to other customers and is not used to train/improve any products or services. It offers a BAA for HIPAA compliance, but it is very expensive.
  • Other: Pricing remains high for the HIPAA BAA; however, Hugging Face now offers tighter integration with enterprise observability stacks and continues to expand open-source model hosting.

Our Favorite Alternatives to Cloud Hosting AI Platforms


BentoML

BentoML is an open-source platform designed to deploy, manage, and scale machine learning models across environments such as on-premises, data centers, edge, and embedded offerings.

Overview: BentoML is a development library for building AI applications with Python. It contains everything you need to boot up an open-source model of your choice and make it accessible as an API endpoint for your application. Typically, you would download an open-source model from Hugging Face, package it with BentoML, export it as a Docker image, and run it anywhere you like.


Hugging Face TGI

Hugging Face TGI (Text Generation Inference) is designed to deploy and manage Hugging Face models across various environments such as on-premises, data centers, edge, and embedded offerings.

Overview: Hugging Face TGI is a development toolkit for deploying and serving LLMs. It has built-in support for buffering multiple API requests and supports quantization, token streaming, and telemetry (using OpenTelemetry and Prometheus). Specialized versions are available for different GPU lines (Nvidia, AMD, AWS Inferentia). It is delivered as a Docker image, so you typically boot it up with parameters anywhere you like.


NVIDIA Triton Inference Server

NVIDIA Triton Inference Server is a powerful platform for deploying and managing large-scale AI models. It supports multiple frameworks and provides a unified interface for model serving.

Overview: Triton Inference Server is an open-source software toolkit developed by Nvidia to serve one or more models of many types concurrently on Nvidia GPUs. It supports model ensembles (allowing multiple models to be chained together), a C and Java API (to link directly with application code), metrics (via Prometheus), and both HTTP and gRPC APIs. It is available as a Docker image.

Which Hosting Solution Will Work for Your Business?

When selecting the best hosting solution for your AI model, several factors must be considered, such as scalability, cost, deployment options, and platform-specific features.  You might also need to consider the business problems AI addresses when choosing which hosting platform works best for you. 
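Cost is often the easiest of these factors to reason about with a little arithmetic. The sketch below, in C#, compares token-based (serverless) pricing with per-hour (managed compute) pricing and finds the monthly token volume at which the two cost the same. The class name, the method names, and every price used with it are hypothetical placeholders, not real platform rates; real price lists usually also distinguish input tokens from output tokens, which this sketch ignores.

```csharp
using System;

// Illustrative only: names and rates are assumptions, not taken from any provider.
public static class HostingCost
{
    // Cost of a token-billed deployment for a given number of tokens.
    public static decimal TokenCost(long tokens, decimal pricePerMillionTokens) =>
        tokens / 1_000_000m * pricePerMillionTokens;

    // Cost of an hourly-billed deployment that stays up for the given hours.
    public static decimal HourlyCost(decimal hours, decimal pricePerHour) =>
        hours * pricePerHour;

    // Token volume at which both pricing models cost the same.
    public static decimal BreakEvenTokens(
        decimal hours, decimal pricePerHour, decimal pricePerMillionTokens) =>
        HourlyCost(hours, pricePerHour) / pricePerMillionTokens * 1_000_000m;
}
```

With made-up rates of $2 per hour for 730 hours and $5 per million tokens, HostingCost.BreakEvenTokens(730m, 2m, 5m) comes out to 292,000,000 tokens. Below that monthly volume, token-based pricing would be the cheaper option under these assumptions.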

Azure AI Foundry, Azure OpenAI Service, AWS Bedrock, and GCP Vertex AI are strong contenders for cloud-based flexibility and ease of use. If you don’t require a BAA, Hugging Face Enterprise is an incredibly cost-effective option. Hugging Face TGI, BentoML, and NVIDIA Triton offer robust deployment options for those preferring on-premises solutions.

Regardless of which hosting platform you choose for your AI solution, our Rōnin Consulting team can help. Contact us today to learn how we can set up your team with the right on-premises or cloud platform.

Azure Service Bus: Scheduling Reliable Messages
https://www.ronin.consulting/cloud-technology/azure/azure-service-bus-scheduling-messages/
Sun, 29 May 2022 23:46:09 +0000

Introduction To Azure Service Bus

Azure Service Bus gives developers a simple way to control when messages become available on Azure Service Bus Queues. Let’s take a look at how you schedule messages and a simple example of when you would use it.

How It Works

When posting a message to the Azure Service Bus, we need to set the ScheduledEnqueueTime property on the message (named ScheduledEnqueueTimeUtc in the older Microsoft.Azure.ServiceBus SDK). The property is a DateTimeOffset, so it pins down an exact instant regardless of the caller’s time zone; working in UTC keeps things unambiguous. This makes it very easy to tell the Azure Service Bus when the message should appear in the Queue and be available to any receivers. Let’s look at a quick code example:

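    // Requires the Azure.Messaging.ServiceBus NuGet package:
    //   using System;
    //   using System.Threading.Tasks;
    //   using Azure.Messaging.ServiceBus;
    private async Task SendMessage(string queue, string message,
        DateTimeOffset? enqueueTime = null)
    {
        // _connection holds the Service Bus connection string.
        await using var client = new ServiceBusClient(_connection);
        await using ServiceBusSender sender = client.CreateSender(queue);

        var queueMessage = new ServiceBusMessage(message);

        // When set, the message stays hidden from receivers until this time.
        if (enqueueTime != null)
            queueMessage.ScheduledEnqueueTime = enqueueTime.Value;

        await sender.SendMessageAsync(queueMessage);
    }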

To be clear, this is not the time the message gets processed; it’s just when it becomes available in the Queue. How soon the message is actually processed depends on how many messages are ahead of it in the queue.

Real-Life Example

Let’s say within your solution you have an Azure App Service hosting your API that posts messages on your Azure Service Bus. Listening to a Queue on the bus is an Azure Function with a Service Bus Trigger. When our Azure Function is triggered, it will send out a text message immediately.

[Image: API posting messages to a Service Bus Queue that triggers an Azure Function]

Over a few weeks, the business receives feedback that end users would prefer to receive text messages after a certain time. The change is easy enough: when your API posts messages to the Azure Service Bus, it sets ScheduledEnqueueTime to the appropriate time.
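Working out "the appropriate time" is mostly date arithmetic. The following is a minimal sketch, assuming each recipient has a known UTC offset and the business rule is "no texts before a given local time". SendWindow, NextSendTimeUtc, and its parameters are hypothetical names for illustration, and a fixed offset ignores daylight saving time, so a real implementation would likely use TimeZoneInfo instead.

```csharp
using System;

public static class SendWindow
{
    // Returns the UTC instant at which a message should become available,
    // or null when it can be sent right away.
    public static DateTimeOffset? NextSendTimeUtc(
        DateTimeOffset now, TimeSpan userUtcOffset, TimeSpan earliestLocalTime)
    {
        // Express "now" on the recipient's clock.
        DateTimeOffset local = now.ToOffset(userUtcOffset);

        // Already past the earliest allowed time today: no delay needed.
        if (local.TimeOfDay >= earliestLocalTime)
            return null;

        // Otherwise schedule for the earliest allowed time today, back in UTC.
        var scheduledLocal = new DateTimeOffset(
            local.Date + earliestLocalTime, userUtcOffset);
        return scheduledLocal.ToUniversalTime();
    }
}
```

The result can be passed straight through as the optional enqueueTime argument of the SendMessage method shown earlier, since a null value there means the message is sent without scheduling.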

Checking To See If It Works

Now that we have messages scheduled, we can look inside our Azure Service Bus, select the Queue the messages are in, and check out how many are scheduled.

[Image: Azure Service Bus Queue showing the scheduled message count]

Conclusion

As you can see, scheduling messages with Azure Service Bus is not as complicated as you might believe. It is a simple way to control when messages become available in the queue, and simple to execute. To learn more about Azure Service Bus or any other Azure services, contact a Rōnin today.

About Rōnin Consulting – Rōnin Consulting provides software engineering and systems integration services for healthcare, financial services, distribution, technology, and other business lines. Services include custom software development and architecture, cloud and hybrid implementations, business analysis, data analysis, and project management for a range of clients from the Fortune 500 to rapidly evolving startups. For more information, please contact us today.

Quality Reporting with Azure Functions & Twilio SendGrid
https://www.ronin.consulting/cloud-technology/azure/reporting-with-azure-functions-twilio-sendgrid/
Mon, 27 Apr 2020 13:59:00 +0000

As a developer, especially one who loves designing UI/UX, hearing “We need this Reporting Feature built” brings me so much excitement. The first thing that comes to my mind is typically data visualization. There will be cool graphs, and lots of customization of what a user can select, and when it’s complete, users will sing your name in praise.

But then, reality sets in when you find out what the end result really is–an Excel document. You may think, “Well, that’s not really that fun,” but I had a lot of fun designing and building this feature. So, let’s walk through how it went end-to-end and some reporting with Azure functionality I was able to implement!

Requirements Of This Build

As with everything we build, there is always a set of requirements that must be met. In our case, it was a pretty simple set:

  • The Report being created targets Work Load for the system.
  • The result is an Excel document, which must contain defined columns.
  • Users need a way to generate the report.
  • Users need a way to save configured properties they used to generate the report so they can quickly re-run it later.

So far, straightforward. But let’s view it from a higher level and see how these Azure features need to fit in.

Architectural Guidelines

From the business perspective, the end result is very simple: “We need this,” but from the development perspective, there are quite a few layers to account for. For this particular project we have the following assumptions/limitations:

  • Uses Microsoft Azure.
  • Fits into a Microservice Architecture.
  • Reports may contain large amounts of data.
  • Currently doesn’t leverage a Data Warehouse.
  • Minimal increase to the project’s administrative footprint.
  • The potential for additional reports down the road.

Given the above, I dug into the Azure toolbox to develop the best approach.

The Approach

So, where to begin? First and foremost, I diagrammed the parts and pieces that needed to be involved with the Reporting Feature.

[Image: diagram of the Reporting Feature setup]

Reporting UI

To get the report kicked off, we needed a set of pages to support the new functionality. In our existing Angular SPA, we would introduce the following views:

  • A grid of saved reports with CRUD operations.
  • A grid of report jobs to act as an audit table.
  • A create/edit report form.

API Management Service

The existing Angular SPA accesses all of its APIs via an Azure API Management service. The service is a great way to provide access to public-facing APIs while giving you control over requests made to them via policies. Here we can control validation of who’s accessing our APIs (typically through JWT), caching, usage quotas, and rate limits.

Report/Report Job Controller

With our existing microservice architecture, we already had a good home for these new endpoints. In one of our existing APIs, we added a new Report and Report Job Controller that would be responsible for:

  • Standard CRUD operations.
  • Generating the report.

The Report Job’s generate report endpoint would ultimately handle sending requests via Flurl to a new Azure Function to generate the Report.

Reporting with Azure Functions

The Azure Function would be the brains of the operation. Accessed via an HttpTrigger, we would POST a configuration model, and the Function would operate asynchronously. From there, the Function would need to communicate with any number of APIs to aggregate data, then generate and email the report.
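As a rough illustration of the configuration model being POSTed, here is a minimal sketch that deserializes the request body with System.Text.Json and rejects requests without a recipient. Only the Recipient property is suggested by the code later in this post; the ReportRequest and ReportRequestParser names, the ReportType property, and the validation rule are assumptions made for the example.

```csharp
using System;
using System.Text.Json;

// Hypothetical shape of the posted configuration model.
public sealed class ReportRequest
{
    public string ReportType { get; set; } = "";
    public string Recipient { get; set; } = "";
}

public static class ReportRequestParser
{
    // Accept both "recipient" and "Recipient" style property names.
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    public static ReportRequest Parse(string requestBody)
    {
        var request = JsonSerializer.Deserialize<ReportRequest>(requestBody, Options);

        // Fail fast: without a recipient there is nowhere to email the report.
        if (request is null || string.IsNullOrWhiteSpace(request.Recipient))
            throw new ArgumentException(
                "Report request must include a recipient.", nameof(requestBody));

        return request;
    }
}
```

Validating the model up front means a malformed request is rejected before any data aggregation starts.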

To email the resulting report, we can take advantage of the Twilio SendGrid integration that Azure Functions provides.

So why did this approach work well for us?

How It Fits In

From diagramming how this feature could be built, you can quickly see how well this approach satisfied the business requirements and would fit seamlessly into our current architecture.

  • Users had a dedicated area to save and run reports and view the state of those reports.
  • Users would not be sitting there waiting for the request to be completed since the Azure Function was running the report generation asynchronously. This also ensured the Azure Function didn’t time out since it used an HttpTrigger.
  • We leveraged an existing API and Database.
    • The advantage here is that we didn’t introduce new services that would require extra ARM template management or CI/CD management.
  • We leverage an Azure Function to be the reporting brains.
    • Can hook our APIs in.
    • Can create strategies to generate different reports, allowing flexibility for enhancements in the future.
    • Can take advantage of SendGrid integration to email the reports.
    • Can create an ARM template to handle creating/updating the Function in a target environment.
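The "strategies to generate different reports" idea from the list above can be sketched as a small strategy lookup. This is an illustration under assumptions rather than the project's actual code: the interface, the resolver, and the column names are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// One implementation per report type.
public interface IReportStrategy
{
    string ReportType { get; }
    IReadOnlyList<string> Columns { get; }
}

public sealed class WorkLoadReportStrategy : IReportStrategy
{
    public string ReportType => "WorkLoad";

    // Placeholder columns; the real report's columns are not listed in this post.
    public IReadOnlyList<string> Columns { get; } = new[] { "Assignee", "OpenItems" };
}

public sealed class ReportStrategyResolver
{
    // Report type names are matched case-insensitively.
    private readonly Dictionary<string, IReportStrategy> _strategies =
        new Dictionary<string, IReportStrategy>(StringComparer.OrdinalIgnoreCase);

    public ReportStrategyResolver(IEnumerable<IReportStrategy> strategies)
    {
        foreach (var strategy in strategies)
            _strategies[strategy.ReportType] = strategy;
    }

    public IReportStrategy Resolve(string reportType) =>
        _strategies.TryGetValue(reportType, out var strategy)
            ? strategy
            : throw new NotSupportedException(
                $"No strategy registered for report type '{reportType}'.");
}
```

Adding a report later then means writing one new IReportStrategy implementation and registering it, without touching the Function's entry point.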

With the plan laid out, let’s take a look at how easy it was to integrate SendGrid with the Reporting Azure Function.

SendGrid Integration

We include the SendGrid binding in our Azure Function to begin the integration.

[FunctionName("CreateSystemReport")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = null)] HttpRequest req,
    [SendGrid] IAsyncCollector<SendGridMessage> messageCollector,
    ILogger log, ExecutionContext executionContext)
{
    log.LogInformation("Request for Report generation received.");

    // Read the posted report configuration as raw JSON.
    string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
    log.LogDebug($"Event Message: {requestBody}");

    // Hand the request and the SendGrid collector to the report processor.
    var context = new ReportProcessorContext(requestBody, messageCollector, log);
    context.ProcessRequest();

    return new NoContentResult();
}

The SendGrid binding does need one piece of information to connect properly, and that’s its API key. One way to do that is by specifying the ApiKey name as a property in the binding.

[SendGrid(ApiKey = "CustomSendGridKeyAppSettingName")] IAsyncCollector<SendGridMessage> messageCollector

Alternatively, you can omit the ApiKey, which will default to looking in your settings file for a property named AzureWebJobsSendGridApiKey.

So, where do you get this key?

Setting up SendGrid in Azure

To acquire your API key for SendGrid, you must first create a SendGrid user account in Azure. This is a very straightforward process and is effectively a sign-up form.

When you’ve completed creating the account, you’ll see that there’s now a new service with a type of SendGrid Account.

Clicking on it will take you to the service overview page in Azure where you can click the Manage button to access the SendGrid portal.

[Image: Twilio SendGrid dashboard]

SendGrid will walk you through creating an API key and testing it, but in case that isn’t the experience you’re presented with, you can create the keys yourself. Under Settings in the navigation menu, you’ll see an API Keys item.

[Image: Twilio SendGrid API Keys page]

This will list any keys you’ve created so far. For me, I’ve already created a key so you see my existing development one listed. If you click the Create API Key button in the top right, you’ll be presented with a very simple form.

[Image: Twilio SendGrid Create API Key form]

After you fill out the form, you’ll be presented with your API Key. This is the key you can use to integrate your Azure Function with SendGrid.

Sending Emails with Attachments

Now that we have an API Key and we’ve connected SendGrid to our Azure Function, we can send emails. In our case, we want to send an email with an attachment, so let’s see how we did just that.

If you remember from our example above, our binding brought into the Function the IAsyncCollector<SendGridMessage> messageCollector.

 [SendGrid] IAsyncCollector<SendGridMessage> messageCollector

To send an actual email, we just need to create a SendGridMessage object, pass it to the messageCollector, and off it goes.

//Create message
var message = new SendGridMessage();
message.AddTo(_request.Recipient);
message.AddContent("text/html", "The Work Load report is attached.");
message.SetFrom(new EmailAddress("notreal@fake.email"));
message.SetSubject(baseName);

//Send the message (awaited so failures surface in the Function)
await messageCollector.AddAsync(message);

Super easy. So, how do we add an attachment to the message? Ultimately, we need to send the report along with the email. To do that, we create an appropriately named Attachment object that is handed to the message.

//Create the Workbook of the Report
....

//Save it to a stream (disposed automatically at the end of the scope)
using var ms = new MemoryStream();
workBook.SaveAs(ms);

//Create the attachment; SendGrid expects the content as Base64 text
var reportAttachment = new Attachment()
{
   Content = Convert.ToBase64String(ms.ToArray()),
   Type = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
   Filename = "Report.xlsx",
   Disposition = "inline",
   ContentId = "Report"
};

//Create message
var message = new SendGridMessage();
message.AddTo(_request.Recipient);
message.AddContent("text/html", "The Work Load report is attached.");
message.SetFrom(new EmailAddress("notreal@fake.email"));
message.SetSubject(baseName);

//Add the attachment to the message
message.AddAttachments(new List<Attachment>() { reportAttachment });

//Send the message (awaited so failures surface in the Function)
await messageCollector.AddAsync(message);

Once again, super simple.

Conclusion

It was amazing to me how simple the integration between Azure Functions and SendGrid turned out to be. The integration’s simplicity allowed me to focus on more important things related to business needs vs. an extensive implementation. Not to mention, SendGrid is free for up to 25,000 emails and scales for cheap beyond that.

I’m a happy developer with how this turned out, given the requirements and limitations. As always, I’d love to hear about your experiences with Azure Functions, SendGrid, or just technology in general.

ETL in Azure
https://www.ronin.consulting/cloud-technology/azure/etl-in-azure/
Mon, 20 Apr 2020 23:56:57 +0000

ETL (Extract, Transform, and Load) has been a part of just about every digital transformation project we have worked on at Rōnin. Whether it’s moving data out of an on-premise legacy system to be consumed by newer cloud-based applications or just combining data from disparate systems to be used in a reporting data warehouse, ETL processes are a necessary part of most enterprise solutions. So, what does Azure provide to help us with ETL?

Revising Our Approach

For a few years, the approach we used was relatively bare metal. That is, we leveraged Azure Functions and Web Jobs to connect to data sources, transform the data with custom code, and ultimately push it to a target location. It got the job done for sure, but as you can imagine there is a fair amount of boilerplate code. Additionally, the monitoring and scaling implementations were different for each solution.

Today, every ETL discussion starts with Azure Data Factory. While we had to pass on using early versions of ADF in favor of the bare metal process to really get our work done, that’s not the case anymore. ADF has a ton of ways to ingest data, it scales well, and Data Flow offers a ton of transformation options without writing any custom code.

Our High-Level Decision Guide

Microsoft positions ADF specifically as an Azure service to manage ETL and other integrations at big data scale. While there are many ways to employ ADF for the solution, we’ve specifically found the following questions and answers most useful as our guide:

  • If we only need to perform extraction and loading of data (for example making data from a legacy system available in the cloud), ADF’s basic pipeline activities are sufficient.
  • If we also need transformations, we’ll start out using ADF’s Data Flow features. We have found that the majority of transformations that we need (joins, unions, derivations, pivots, aggregates, etc) can be handled with the Data Flow user interface.
  • If the transformations involve some edge case scenarios, are hard to visualize in a UI, or there is just a comfort level developing it as code, Azure Databricks (ADB) integration can be used to perform these transformations.

Basic Pipeline Activities Approach

The basic workflow in an Azure Data Factory is called a pipeline. A pipeline is an organization of activities (data movement, row iteration, conditionals, basic filtering, etc) against source and target data sets. Directly, it offers little in the way of transformation activities, though you can hook it to Azure Functions or Azure Databricks (see Azure Databricks Approach below) for more advanced cases.

Below is an example basic pipeline I created very quickly using just a few standard activities. This pipeline simply reads an Employee CSV dataset from Azure BLOB storage, filters the records to only those of new employees, then loops over each row and calls a stored procedure to insert each employee record into a SQL Server database table. The resulting output in the bottom frame is from a debug session.

[Image: example Azure Data Factory pipeline with debug session output]

For more information on pipelines and available activities, check out https://docs.microsoft.com/en-us/azure/data-factory/concepts-pipelines-activities.

Data Flow Approach

A Data Flow is a visually designed data transformation for use in Azure Data Factory. The Data Flow is designed in ADF, then invoked during a pipeline using a Data Flow Activity. The transformations offered here offer a lot of power and configuration options through an easy to follow interface. Joining and splitting data sets, cleaning data, deriving new columns, filtering and sorting results, and running expression functions on row data are some of the possibilities with Data Flow. All of it is done through simple user interface controls.

Below is an example mapping data flow I created very quickly to show just a few of the transformation components that can be used. This data flow reads HR employee data, contractor data, and billing info from three different systems. It performs some filtering and new column generation, then combines all of these results. Finally, it sorts the results and writes them to a CSV file.

[Image: example Azure Data Factory mapping data flow]

For more information on Data Flows, check out https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview.

Azure Databricks Approach

There are times when it makes sense to simply write code to perform a data transformation. For example:

  • There’s a weird edge case that the data flow user interface can’t accommodate.
  • There’s going to be a high degree of refactoring (change) needed over time and the data flows will be large. It would be much easier / faster to tweak the code than to try to re-write large data flows.
  • The source data is already exported as enormous amounts of unstructured data into Azure Data Lake Storage, the file system natively integrated into Azure Databricks.

In these cases, Azure Data Factory pipelines can invoke notebooks in Azure Databricks using a Databricks Notebook activity. Notebooks define Scala, Python, SQL, or R code to manipulate and query large volumes of data (terabytes) on its specialized Azure Data Lake Storage file system.

In the below example, I created a simple Databricks notebook to read two CSV files that have been dropped into Azure Data Lake Storage. These could have just as easily been Excel, JSON, parquet, or some other file format as long as there is an extension to read them into data frames. This example takes employee and HR files, joins the rows, computes a PTO Remaining column, orders the results, then stores the data back out as a new CSV file. Azure Data Factory would have a pipeline configured with a Databricks Notebook activity to call this notebook, passing the two CSV file names.
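To make the transformation itself concrete, here is the same join, derive, and sort logic sketched in plain C# with LINQ, in the spirit of the "bare metal" custom-code approach described earlier. This is not the notebook's code (which is Scala and SQL): the record types, property names, and the PTO formula (accrued minus used) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shapes for the two input files and the output.
public record Employee(int Id, string Name);
public record HrRecord(int EmployeeId, decimal PtoAccrued, decimal PtoUsed);
public record EmployeePto(int Id, string Name, decimal PtoRemaining);

public static class PtoTransform
{
    public static List<EmployeePto> Run(
        IEnumerable<Employee> employees, IEnumerable<HrRecord> hrRecords) =>
        employees
            // Inner join: keep employees that have a matching HR row.
            .Join(hrRecords,
                  e => e.Id,
                  h => h.EmployeeId,
                  // Derive the PTO Remaining column while joining.
                  (e, h) => new EmployeePto(e.Id, e.Name, h.PtoAccrued - h.PtoUsed))
            // Order the results by name before they are written out.
            .OrderBy(row => row.Name, StringComparer.Ordinal)
            .ToList();
}
```

In-memory LINQ like this only suits data that fits on one machine, which is exactly the gap Databricks fills for the terabyte-scale cases mentioned above.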

[Screenshot: Databricks notebook that joins the two CSV files and computes PTO Remaining]
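Since the notebook itself only survives as a screenshot, here is a minimal plain-Python sketch of the same logic (read two CSVs, join, compute PTO Remaining, sort, write CSV). It uses only the standard library rather than Spark data frames, and the column names (`employee_id`, `pto_allowed`, `pto_used`) and sample rows are assumptions made for illustration.

```python
import csv
import io

# Hypothetical file contents; in Databricks these would be files in the data lake.
EMPLOYEES_CSV = """employee_id,name
1,Susan Smith
2,Raj Patel
"""
HR_CSV = """employee_id,pto_allowed,pto_used
1,20,5
2,15,15
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def pto_report(employees_csv, hr_csv):
    """Inner-join employees to HR rows and compute the PTO remaining column."""
    hr_by_id = {row["employee_id"]: row for row in read_rows(hr_csv)}
    report = []
    for emp in read_rows(employees_csv):
        hr = hr_by_id.get(emp["employee_id"])
        if hr is None:
            continue  # no matching HR row, so the employee is dropped
        report.append({
            "employee_id": emp["employee_id"],
            "name": emp["name"],
            "pto_remaining": int(hr["pto_allowed"]) - int(hr["pto_used"]),
        })
    # Order the results with the most remaining PTO first.
    report.sort(key=lambda row: row["pto_remaining"], reverse=True)
    return report


def write_csv(rows):
    """Serialize the report back out as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["employee_id", "name", "pto_remaining"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(write_csv(pto_report(EMPLOYEES_CSV, HR_CSV)))
```

A real notebook would do the same with Spark data frames so the work is distributed across the cluster; the control flow is what carries over.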

Note how you can seamlessly switch between languages in a notebook. This one is a Scala-based notebook that switches to SQL midway through. The environment is also very interactive for debugging against large amounts of data, a nice feature since the ADF user interface can only show limited amounts of data in its debug sessions.

Interestingly, ADF’s data flows are implemented as generated Scala code running in its own managed Databricks cluster. This all happens behind the scenes, but it explains why it’s actually hard to come up with everyday use cases where data flows aren’t sufficient. For more information on transformations with ADB, check out https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse.

Parting Thoughts

As awesome as ADF is, it is true that it’s not always the be-all and end-all for ETL. There are still times when ADF is only part of the solution. For example:

  • The server infrastructure hosting an on-premises or alternate cloud database from which ADF needs to pull can’t host the integration runtime (a requirement for ADF to reach the data). In this case, we may have to design and build interesting ways to get the data out and into Azure, accessible to ADF.
  • Though accessible by network to ADF, the source data is contained in a format ADF can’t read. In this case, we may have to build or leverage third-party software to extract the data into a digestible format for ADF.
  • The source data’s schema is so bonkers that Azure Databricks is necessary to pull off the transformation. However, the client may be unwilling to pay the hefty ongoing cost of ADB, or they may not be comfortable supporting it down the road and would rather see a more traditional C# or T-SQL coded solution. In this case, we may fall back to a bare metal approach.

As with most software projects, one size never fits all. But we highly recommend you give ADF a strong look on your next ETL adventure. We’d love to help you with it!

About Rōnin Consulting – Rōnin Consulting provides software engineering and systems integration services for healthcare, financial services, distribution, technology, and other business lines. Services include custom software development and architecture, cloud and hybrid implementations, business analysis, data analysis, and project management for a range of clients from the Fortune 500 to rapidly evolving startups. For more information, please contact us today.

Solving the Mapping Problem for Web Service APIs https://www.ronin.consulting/cloud-technology/solving-the-mapping-problem-for-web-service-apis/ Mon, 23 Mar 2020 00:05:00 +0000 http://www.ronin.consulting/?p=536

At Rōnin, it’s typical for us to be working on a project that needs to tie multiple systems together. These can be bare metal integrations that work directly against application databases, but more modern integrations will feature some type of web service API (typically REST or SOAP) with the actual payload being JSON or XML data. Today, I thought I’d share a solution I recently used to solve the problem of mapping the payload from one web service API to another.

Mapping API Data

For this particular project, I needed to receive a large web service API payload (a Salesforce event notification with an XML payload), do some interesting internal work (audits, fire off messages to an Azure Service Bus topic, etc.), and finally deliver the payload to another third-party application. Since the incoming payload (XML) would be different from the payload accepted by the third-party application’s web service API (JSON), payload transformation also needed to occur.

There are many ways to pull off this implementation in Azure (HTTP triggered Azure Function App, Azure App Service API, Azure Logic App, etc.), but for this project, we chose Azure Logic Apps. Using Logic Apps would allow the client to make minor tweaks to the process flow without needing us to come back and write additional C# code, which is a bit of a holy grail for many small IT shops requesting project help.


However, the approach did present a problem regarding the payload transformation. For small/simple payloads, it makes sense to use the Logic App JSON creation actions and built-in function expressions, but it gets messy as the payload grows in complexity. Writing some C# code to read a large XML or JSON payload and creating another is relatively simple. This could be placed in an Azure Function App and called from our Logic App, but I’d arrive at a mapping mechanism that would not be tweakable by our client later on.

We used Azure Logic Apps for a previous client to produce EDI flat files. For that implementation, an Azure Integration Account was used to hold maps. These maps described transforming an XML payload to an EDI flat file. That exact approach would be a bit heavy here, as it involves navigating custom source XML schemas, using the Microsoft BizTalk mapper (heavy), and in some cases some custom XSLT or BizTalk scripting functoid development.


Still, something like this approach that was lighter weight and integrated with Logic Apps would be a solid fit.

Liquid is Solid

After some research, I stumbled upon the perfect solution for this client: Liquid templates.

Azure Logic Apps have a series of actions for data transformation, pulling the transformation definitions, or maps, from an Azure Integration Account. In addition to BizTalk maps, Integration Accounts natively support Liquid templates (https://shopify.github.io/liquid/). Liquid is an open-source template language that works a bit like XSLT. Using Liquid tags (control flow) and filters (output manipulation), intermediate and some complex JSON and XML transformations can be achieved from an Azure Logic App by editing a simple text file and loading it into an Azure Integration Account.


For example, suppose I have the following incoming XML payload from a web service API (e.g., Salesforce):

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" 
   xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soapenv:Body>
  <notifications xmlns="http://soap.sforce.com/2005/09/outbound">
   <OrganizationId>00DL0000005xwjAMAQ</OrganizationId>
   <ActionId>04kL00000000BCFIA2</ActionId>
   <SessionId>... session id ...</SessionId>
   <EnterpriseUrl>... url ...</EnterpriseUrl>
   <PartnerUrl>... url ...</PartnerUrl>
   <Notification>
    <Id>12345</Id>
    <sObject xsi:type="sf:Beneficiary__c" xmlns:sf="urn:sobject.enterprise.soap.sforce.com">
     <sf:Id>67890</sf:Id>
     <sf:First_Name__c>Susan</sf:First_Name__c>
     <sf:Last_Name__c>Smith</sf:Last_Name__c>
     <sf:Address_City__c>Atlanta</sf:Address_City__c>
     <sf:Address_State__c>GA</sf:Address_State__c>
     <sf:Phone__c>(678) 867-5309</sf:Phone__c>
     <sf:Primary__c>Yes</sf:Primary__c>
     <sf:Contingent__c>No</sf:Contingent__c>
     <sf:DOB__c>2012-05-11T00:00:00.000Z</sf:DOB__c>
     <sf:Allocation__c>80%</sf:Allocation__c>
    </sObject>
   </Notification>
  </notifications>
 </soapenv:Body>
</soapenv:Envelope>

Further, suppose I need to map this to an outgoing JSON payload for a target web service API as follows:

{
  "Id": 67890,
  "First": "Susan",
  "Last": "Smith",
  "City": "Atlanta",
  "State": "GA",
  "Phone": "6788675309",
  "Status": "PRIM",
  "DoB": "05/11/2012",
  "Allocation": 80.0
}

Notice a few wrinkles in this sample mapping:

  • I need to strip some formatting from a phone number
  • I need to collapse Primary and Contingent elements to a single JSON value
  • I need to re-format a date to mm/dd/yyyy

Here’s a Liquid template I would write to get this transformation done:

{
  "Id": {{content.Envelope.Body.notifications.Notification.sObject.Id}},
  "First": "{{content.Envelope.Body.notifications.Notification.sObject.First_Name__c}}",
  "Last": "{{content.Envelope.Body.notifications.Notification.sObject.Last_Name__c}}",
  "City": "{{content.Envelope.Body.notifications.Notification.sObject.Address_City__c}}",
  "State": "{{content.Envelope.Body.notifications.Notification.sObject.Address_State__c}}",

{%- capture formattedPhone -%}
  {{content.Envelope.Body.notifications.Notification.sObject.Phone__c | Strip | Remove: "("  
      | Remove: ")" | Remove: "-" | Remove: "." | Remove: " "}}
{%- endcapture -%}
  "Phone": "{{formattedPhone}}",

{%- if content.Envelope.Body.notifications.Notification.sObject.Primary__c == "Yes" -%}
  "Status": "PRIM",
{%- else -%}
  "Status": "CNTG",
{%- endif -%}

{%- capture extractedYear -%}
  {{content.Envelope.Body.notifications.Notification.sObject.DOB__c | Slice: 0, 4 }}
{%- endcapture -%}
{%- capture extractedMonth -%}
  {{content.Envelope.Body.notifications.Notification.sObject.DOB__c | Slice: 5, 2 }}
{%- endcapture -%}
{%- capture extractedDay -%}
  {{content.Envelope.Body.notifications.Notification.sObject.DOB__c | Slice: 8, 2 }}
{%- endcapture -%}
  "DoB": "{{extractedMonth}}/{{extractedDay}}/{{extractedYear}}",

  "Allocation": {{content.Envelope.Body.notifications.Notification.sObject.Allocation__c 
      | Remove: "%"}}
}
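When tweaking a template like this, it can help to have a reference implementation to compare its output against. The sketch below performs the same mapping in plain Python using only the standard library. It is not part of the Logic App; the function name `map_beneficiary` and the trimmed sample payload are assumptions made for illustration, and the phone cleanup removes the same characters as the template's `Remove` filters.

```python
import json
import xml.etree.ElementTree as ET

# Namespace prefixes used in the lookups below ("out" is a local alias).
NS = {
    "soapenv": "http://schemas.xmlsoap.org/soap/envelope/",
    "out": "http://soap.sforce.com/2005/09/outbound",
    "sf": "urn:sobject.enterprise.soap.sforce.com",
}

# Trimmed copy of the sample notification, without the XML declaration.
SAMPLE = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soapenv:Body>
  <notifications xmlns="http://soap.sforce.com/2005/09/outbound">
   <Notification>
    <Id>12345</Id>
    <sObject xsi:type="sf:Beneficiary__c" xmlns:sf="urn:sobject.enterprise.soap.sforce.com">
     <sf:Id>67890</sf:Id>
     <sf:First_Name__c>Susan</sf:First_Name__c>
     <sf:Last_Name__c>Smith</sf:Last_Name__c>
     <sf:Address_City__c>Atlanta</sf:Address_City__c>
     <sf:Address_State__c>GA</sf:Address_State__c>
     <sf:Phone__c>(678) 867-5309</sf:Phone__c>
     <sf:Primary__c>Yes</sf:Primary__c>
     <sf:Contingent__c>No</sf:Contingent__c>
     <sf:DOB__c>2012-05-11T00:00:00.000Z</sf:DOB__c>
     <sf:Allocation__c>80%</sf:Allocation__c>
    </sObject>
   </Notification>
  </notifications>
 </soapenv:Body>
</soapenv:Envelope>"""


def map_beneficiary(xml_text):
    """Map the notification's sObject to the target JSON structure."""
    root = ET.fromstring(xml_text)
    sobject = root.find(
        "soapenv:Body/out:notifications/out:Notification/out:sObject", NS
    )

    def field(name):
        return sobject.findtext(f"sf:{name}", default="", namespaces=NS).strip()

    # Strip the same formatting characters the template removes.
    phone = field("Phone__c")
    for ch in "()-. ":
        phone = phone.replace(ch, "")

    # DOB arrives as an ISO timestamp; keep the date part and reorder it.
    year, month, day = field("DOB__c")[:10].split("-")

    return {
        "Id": int(field("Id")),
        "First": field("First_Name__c"),
        "Last": field("Last_Name__c"),
        "City": field("Address_City__c"),
        "State": field("Address_State__c"),
        "Phone": phone,
        "Status": "PRIM" if field("Primary__c") == "Yes" else "CNTG",
        "DoB": f"{month}/{day}/{year}",
        "Allocation": float(field("Allocation__c").rstrip("%")),
    }


if __name__ == "__main__":
    print(json.dumps(map_beneficiary(SAMPLE), indent=2))
```

Running a handful of sample payloads through both the template and a script like this is a cheap way to catch mapping regressions before the client's changes go live.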

This is just a simple text file I typed up in Notepad++ and then uploaded into an Azure Integration Account:

[Screenshot: adding the Liquid template as a map in the Azure Integration Account]

The Map Type field is key. Instead of using the output XSLT file from the BizTalk mapper, I chose Liquid as the type.

Finally, I used the uploaded Liquid template in the Azure Logic App like so:

[Screenshot: the uploaded Liquid template used in the Azure Logic App]

Real World Web Service Approach

I used this approach to perform all of the XML to JSON payload transformations that were needed for our client’s project. The best part of this approach is that it’s well within their reach to tweak. As more fields are exposed from Salesforce that need to be shuffled down to their target web service API, the client can simply update the map and the JSON parsing.

The approach isn’t without a few issues. A few things I’d consider before using it for another client are:

  • The built-in formatting is currently very limited. I had to write my own formatting markup to rework phone numbers and dates. There are a lot of advanced filters out there that would format strings as I needed, but until Microsoft supports registering these extensions, you’re stuck writing your own.
  • When you do write your own filtering code, the problem is compounded by the fact that you can’t call a block of code over and over. You end up having to cut and paste any complex formatting. Again, this problem should be solved once Microsoft supports registering extensions.

All-in-all, if you are looking for that in-between mapping solution (easily edited, doesn’t need a lot of complex features, etc.), have a look at Liquid templates!

To learn more about our process, or talk to a Rōnin Software development consultant, contact us today!

CI/CD Pitfalls with Branches and Azure Functions Uncovered https://www.ronin.consulting/cloud-technology/ci-cd-pitfalls-with-branches-and-azure-functions/ Wed, 29 Jan 2020 15:03:46 +0000 http://www.ronin.consulting/?p=327

Intro

Azure Functions are a key part of the Microsoft cloud computing platform, and with our focus on integration and continuous delivery, we use them to deliver excellent results for our clients. However, while recently building out some Azure Functions for one of our clients, I encountered an unexpected problem while deploying them. I thought I’d share what I found about these CI/CD pitfalls.

The Problem and CI/CD Pitfalls

The problem arose during the setup of the CI/CD pipeline. Like most Azure implementations, we wanted to set up our build and release pipelines using Azure DevOps, which allows us to automatically run unit tests during check-in and automatically deploy code to Azure Function Apps. The setup was extremely smooth with Azure DevOps integration since the codebase was being stored in Azure Repos. The default branch of code (master) deployed perfectly to the Azure Function App.

Azure Function Deploy Pipeline

As smooth as this was, however, I hit a wall when trying to deploy the code from the development branch to a development slot of the Function App. The Azure DevOps deployment logs were not particularly helpful, and web searches turned up little on the problem. After spending a lot of time experimenting, digging through logs on the server, and reading through the Project Kudu source code, it became clear there was a problem using Zip Deployments. The reason this was so difficult to track down, and the reason there is not much information out there about it, is that it’s only a problem under the following conditions:

  • You’re deploying a non-default branch (not master), such as a feature or dev branch
  • The deployment method is using:
    • An Azure Function App’s integrated Deployment Center
    • An Azure DevOps pipeline using an Azure Functions task

Both of these deploy using ZipPushDeploy, which, at the time of this writing, assumes the source branch is the default (master).

The Solution

Taking some advice from a fellow business partner and Azure expert Byron McClain, I tried using an Azure App Service Deploy task from the release pipeline. Since this task’s advanced options let you choose Web Deploy over the default Zip Deploy, I was back in business. This is our workaround until the good contributors at Project Kudu can patch the issue.

Azure Function App Service Deploy

For more details about the ticket, go here.
For more details about Azure Functions and Serverless Computing, go here.

Happy Azure Trails!

