Enterprise AI
In recent years, artificial intelligence has become less of a buzzword and more of an adopted process across the enterprise. With that adoption comes a growing need to increase operational efficiency as customer demands rise. AI platforms have grown increasingly sophisticated, creating the need to establish clear guidelines and ownership. In DZone's 2022 Enterprise AI Trend Report, we explore MLOps, explainability, and how to select the best AI platform for your business. We also share a tutorial on how to create a machine learning service using Spring Boot, and how to deploy AI with an event-driven platform. The goal of this Trend Report is to better inform the developer audience about practical tools and design paradigms, new technologies, and the overall operational impact of AI within the business. This is a technology space that's constantly shifting and evolving. As part of our December 2022 re-launch, we've added new articles pertaining to knowledge graphs, a solutions directory for popular AI tools, and more.
Continuous Integration Patterns and Anti-Patterns
Getting Started With CI/CD Pipeline Security
Every country has its own currency and its own conventions for displaying monetary amounts. When a number is expressed appropriately, it is easier for readers to scan and comprehend. When you use data from an API or an external resource, however, it arrives in a generic format. For instance, if you are creating a store, you may have data such as pricing. This article will walk you through how to format a number as currency in JavaScript. Let's dive in!

We will be using a random number, such as 17225, as shown in the array below:

JavaScript
const Journals = [
  {
    "id": 1,
    "name": "Software Development",
    "price": 100.80,
  },
  {
    "id": 2,
    "name": "Introduction to Programming",
    "price": 1534,
  },
  {
    "id": 4,
    "name": "Program or Be Programmed",
    "price": 17225,
  }
]

Even adding a currency sign does not solve the problem, since commas and decimals must be placed in the right locations. You would also like each price to be formatted correctly for its currency. For example, 17225 would be $17,225.00 (US dollars), ₹17,225.00 (rupees), or 17.225,00 € (euros), depending on your chosen currency, locale, and style.

You can use JavaScript's Intl.NumberFormat() to convert these numbers to currencies:

JavaScript
const price = 17225;

let KenyaShilling = new Intl.NumberFormat('en-KE', {
  style: 'currency',
  currency: 'KES', // the ISO 4217 code for the Kenyan shilling
});

console.log(`The formatted version of ${price} is ${KenyaShilling.format(price)}`);
// The formatted version of 17225 is Ksh 17,225.00

Output: Ksh 17,225.00

How to Format Numbers as Currency Using the Intl.NumberFormat() Constructor

You can build Intl.NumberFormat objects that enable language-sensitive number formatting, such as currency formatting, using the Intl.NumberFormat() constructor. The constructor takes two arguments, locales and options, both of which are optional:

new Intl.NumberFormat(locales, options)
// we can also use Intl.NumberFormat(locales, options)

Remember that Intl.NumberFormat() can be used either with or without "new." Both will create a new Intl.NumberFormat instance. When neither locales nor options is given, the Intl.NumberFormat() constructor will simply format the number by adding grouping separators:

const price = 17225;
console.log(new Intl.NumberFormat().format(price));

Output: 17,225

As noted above, you are not looking for plain number formatting. Instead, you want to format these numbers as currency so that the currency sign is returned with suitable formatting, rather than having to add it manually. Let's now look at both parameters.

The First Argument: Locales

The locales argument is an optional string that denotes a particular geographical, political, or cultural territory. On its own, it only formats the number according to the locale and does not apply currency formatting:

const price = 172250;

console.log(new Intl.NumberFormat('en-US').format(price)); // 172,250
console.log(new Intl.NumberFormat('en-IN').format(price)); // 1,72,250
console.log(new Intl.NumberFormat('en-DE').format(price)); // 172.250

You can see that the numbers have been formatted regionally, based on the locale. Now let's look at the options parameter to see how we can turn these numbers into currencies.

The Second Argument: Options (Style, Currency…)

This is the main parameter, and you can use it to apply additional formatting, such as currency formatting. It is a JavaScript object with properties such as style, which specifies the sort of formatting you want.
The style property accepts values such as "decimal", "currency", and "unit". currency is another property in the options object; you use it to indicate the currency to format to, such as USD, CAD, GBP, INR, and many more.

JavaScript
// format number to US dollar
let USDollar = new Intl.NumberFormat('en-US', {
  style: 'currency',
  currency: 'USD',
});

// format number to British pound
let Pounds = Intl.NumberFormat('en-GB', {
  style: 'currency',
  currency: 'GBP',
});

// format number to Indian rupee
let Rupee = new Intl.NumberFormat('en-IN', {
  style: 'currency',
  currency: 'INR',
});

// format number to euro
let Euro = Intl.NumberFormat('en-DE', {
  style: 'currency',
  currency: 'EUR',
});

console.log('Dollars: ' + USDollar.format(price)); // Dollars: $172,250.00
console.log(`Pounds: ${Pounds.format(price)}`); // Pounds: £172,250.00
console.log('Rupees: ' + Rupee.format(price)); // Rupees: ₹1,72,250.00
console.log(`Euro: ${Euro.format(price)}`); // Euro: €172,250.00

Another Option: Maximum Significant Digits

maximumSignificantDigits is another option. It allows you to round the price based on the number of significant figures you choose. For example, if you set the value to 3, 172,250.00 becomes 172,000.

JavaScript
let euro = Intl.NumberFormat('en-DE', {
  style: 'currency',
  currency: 'EUR',
  maximumSignificantDigits: 3,
});

console.log(`Euro: ${euro.format(price)}`); // Euro: €172,000

The scope of this article is just the basics of how to use JavaScript to convert a number to a currency format. Happy coding!
SQL Server Management Studio (SSMS) is one of the most proven database administration and management tools. Whether you use it for database administration or just for database development, it's a reliable workhorse. Still, we want to extend its capabilities and adapt the tool to our specific needs. For this, we use a range of SSMS add-ins and extensions.

We'll discuss two productivity-focused add-ins that help us achieve more in less time: dbForge SQL Complete and SSMS Tools Pack. They're a great way to extend the functionality of SSMS, making it more flexible and enhancing our SQL coding with autocompletion, snippets, and refactoring. Let's first take a brief overview of both products, then look at the features that might interest you. This will help you decide whether dbForge SQL Complete or SSMS Tools Pack is the better fit for your work.

The Overview of dbForge SQL Complete

We start with dbForge SQL Complete, available for SSMS and Microsoft Visual Studio. It offers IntelliSense-like, context-based code completion, SQL formatting, and smart refactoring with auto-correction of references, which can make coding up to four times faster. It's suitable for individual use, but it also helps form and unify SQL standards for corporate teamwork. The tool includes a powerful T-SQL debugger, tab coloring, and a document outline. SQL Complete offers many features and a clean interface, all of which make your work more convenient.

Pricing: dbForge SQL Complete is available in three editions: Free Basic (Express), Standard, and Professional. The Express edition is a unique offering that comes completely free of charge; there is no other way to extend the code completion functionality in SSMS for free. SQL Complete can also be purchased as part of a package called dbForge SQL Tools, which includes fifteen essential products that cover nearly every aspect of SQL Server development, management, and administration.

The Overview of SSMS Tools Pack

While the second contender, SSMS Tools Pack, doesn't come close to being as versatile and powerful as the first, it does offer quite a bit of functionality. It is a SQL Server Management Studio plugin created to boost the user's productivity. It's easy to use and delivers a handy SQL editor, CRUD procedure generation, snippets, formatting, convenient search with filtering, and SQL execution history. Like SQL Complete, it also includes features that aren't essential but are nice to have, such as tab coloring and the ability to export to Excel spreadsheets.

Pricing: SSMS Tools Pack is a commercial product with licenses available in Small, Large, and Enterprise team packages. A free version is available for one computer for sixty days.

Feature Comparison of dbForge SQL Complete and SSMS Tools Pack

To make this comparison, we used the latest versions of both tools: SQL Complete v6.12.8 and SSMS Tools Pack v5.5.2. Read the feature descriptions carefully; some may be far more critical for your particular goals than others.
Feature | dbForge SQL Complete | SSMS Tools Pack

Compatibility:
SSMS integration | Yes | Yes
Visual Studio integration | Yes | No

Improved code quality:
Find invalid objects | Yes | No
CRUD procedure generation | Yes | Yes
Generation of the CREATE/ALTER script for server objects | Yes | No
Execution Plan Analyzer | No | Yes
Renaming of objects, variables, and aliases | Yes | No
T-SQL Debugger | Yes | No
Run on multiple targets | Yes | Yes

Safe work with the document environment and databases:
Various options for executing statements | Yes | Yes
Execution warnings | Yes | Yes
Execution notifications | Yes | No
Transaction reminder | Yes | Yes
Run At Status Bar Element | No | Yes
Tab coloring | Yes | Yes
Custom SSMS main window title | Yes | Yes
Execution history of SQL statements | Yes | Yes
Tab management | Yes | Yes
Quick Connect Active SQL Editor Window | No | Yes
Document sessions | Yes | No

Operations with data in the SSMS data grid:
Results Grid data visualizers | Yes | No
Copy Data As from the SSMS grid to XML, CSV, HTML, and JSON | Yes | No
Copy Results Grid headers (column names + types) | Yes | No
Export to Excel from the SSMS Results Grid | No | Yes
Grid aggregates | Yes | Yes
Find in Results Grid | Yes | Yes
Generate Script As from the SSMS data grid | Yes | Yes

Increased coding productivity:
Context-sensitive suggestion of object names and keywords | Yes | No
Expand SELECT * | Yes | Yes
Object information | Yes | No
Parameter information | Yes | No
SQL snippets | Yes | Yes
New query template | No | Yes
'Go to definition' for database objects | Yes | Yes
Highlighted occurrences of identifiers | Yes | No
Named regions | Yes | Yes
Document Outline window | Yes | No

Unified SQL standards:
SQL formatting | Yes | Yes
Multiple predefined formatting profiles | Yes | No
Preservation of the original formatting for any selected piece of code | Yes | No
Command-line interface | Yes | No

Settings:
Import/Export Settings Wizard | Yes | Yes
Quick search for options | Yes | No

Releases:
Initial release | v1.0 (November 19, 2010) | v1.0 (May 1, 2008)
Latest release (as of September 2022) | v6.12 (September 12, 2022) | v5.5 (July 1, 2020)
Total quantity of releases | 133 | 41

The Verdict: dbForge SQL Complete vs. SSMS Tools Pack

As the table shows, dbForge SQL Complete has more features than SSMS Tools Pack. It offers more to improve productivity, provides a much wider range of data operations, has noticeably more advanced formatting, and supports a command-line interface. You should also be aware that SQL Complete is updated more frequently. If you're looking for an SSMS Tools Pack alternative, then SQL Complete is the best solution for you. It's compatible with both Microsoft Visual Studio and SQL Server Management Studio, and it doesn't take a lot of effort to see how effective it is.
Trino is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources. Trino was designed to handle data warehousing, ETL, and interactive analytics: aggregating large amounts of data and producing reports. Alluxio is an open-source data orchestration platform for large-scale analytics and AI. Alluxio sits between compute frameworks such as Trino and Apache Spark and various storage systems like Amazon S3, Google Cloud Storage, HDFS, and MinIO. This is a tutorial for deploying Alluxio as the caching layer for Trino using the Iceberg connector.

Why Do We Need Caching for Trino?

Only a small fraction of the petabytes of data you store is generating business value at any given time. Repeatedly scanning the same data and transferring it over the network consumes time, compute cycles, and resources. This issue is compounded when pulling data from disparate Trino clusters across regions or clouds. In these circumstances, caching solutions can significantly reduce the latency and cost of your queries.

Trino has a built-in caching engine, Rubix, in its Hive connector. While this system is convenient because it comes with Trino, it is limited to the Hive connector and has not been maintained since 2020. It also lacks security features and support for additional compute engines.

Trino on Alluxio

Alluxio connects Trino to various storage systems, providing APIs and a unified namespace for data-driven applications. Alluxio allows Trino to access data regardless of the data source and transparently caches frequently accessed data (e.g., commonly used tables) in Alluxio distributed storage.

Using Alluxio Caching via the Iceberg Connector Over MinIO File Storage

We've created a demo that shows how to configure Alluxio to use write-through caching with MinIO. This is achieved by using the Iceberg connector and making a single change to the location property on the table from the Trino perspective.

In this demo, Alluxio runs on separate servers; however, it's recommended to run it on the same nodes as Trino. This means that all the configuration for Alluxio will be located on the servers where Alluxio runs, while Trino's configuration remains unaffected. The advantage of running Alluxio externally is that it won't compete for resources with Trino, but the disadvantage is that data will need to be transferred over the network when reading from Alluxio. For performance, it is crucial that Trino and Alluxio are on the same network. To follow this demo, copy the code located here.

Trino Configuration

Trino is configured identically to a standard Iceberg configuration. Since Alluxio runs external to Trino, the only configuration needed is at query time and not at startup.

Alluxio Configuration

The configuration for Alluxio can all be set using the alluxio-site.properties file. To keep all configuration colocated in the docker-compose.yml, we set it using Java properties via the ALLUXIO_JAVA_OPTS environment variable. This tutorial also refers to the master node as the leader and the workers as followers.

Master Configurations

alluxio.master.mount.table.root.ufs=s3://alluxio/

The leader exposes ports 19998 and 19999, the latter being the port for the web UI.

Worker Configurations

alluxio.worker.ramdisk.size=1G
alluxio.worker.hostname=alluxio-follower

The follower exposes ports 29999 and 30000, and sets up shared memory used by Alluxio to store data.
This is set to 1G via the shm_size property and is referenced from the alluxio.worker.ramdisk.size property.

Shared Configurations Between Leader and Follower

alluxio.master.hostname=alluxio-leader
# Minio configs
alluxio.underfs.s3.endpoint=http://minio:9000
alluxio.underfs.s3.disable.dns.buckets=true
alluxio.underfs.s3.inherit.acl=false
aws.accessKeyId=minio
aws.secretKey=minio123
# Demo-only configs
alluxio.security.authorization.permission.enabled=false

The alluxio.master.hostname property needs to be set on all nodes, leaders and followers. The majority of the shared configs point Alluxio to the underfs, which is MinIO in this case. alluxio.security.authorization.permission.enabled is set to "false" to keep the Docker setup simple.

Note: This is not recommended in a production or CI/CD environment.

Running Services

First, you want to start the services. Make sure you are in the trino-getting-started/iceberg/trino-alluxio-iceberg-minio directory. Now, run the following command:

docker-compose up -d

You should expect to see the following output. Docker may also have to download the Docker images before you see the "Created/Started" messages, so there could be extra output:

[+] Running 10/10
⠿ Network trino-alluxio-iceberg-minio_trino-network Created 0.0s
⠿ Volume "trino-alluxio-iceberg-minio_minio-data" Created 0.0s
⠿ Container trino-alluxio-iceberg-minio-mariadb-1 Started 0.6s
⠿ Container trino-alluxio-iceberg-minio-trino-coordinator-1 Started 0.7s
⠿ Container trino-alluxio-iceberg-minio-alluxio-leader-1 Started 0.9s
⠿ Container minio Started 0.8s
⠿ Container trino-alluxio-iceberg-minio-alluxio-follower-1 Started 1.5s
⠿ Container mc Started 1.4s
⠿ Container trino-alluxio-iceberg-minio-hive-metastore-1 Started

Open Trino CLI

Once this is complete, you can log into the Trino coordinator node. We will do this by using the exec command to run the trino CLI executable on that container. Notice that the container name is trino-alluxio-iceberg-minio-trino-coordinator-1, so the command you will run is:

docker container exec -it trino-alluxio-iceberg-minio-trino-coordinator-1 trino

When you start this step, you should see the trino cursor once the startup is complete. It should look like this when it is done:

trino>

To best understand how this configuration works, let's create an Iceberg table using a CTAS (CREATE TABLE AS) query that pushes data from one of the TPC connectors into Iceberg, which points to MinIO. The TPC connectors generate data on the fly so that we can run simple tests like this.

First, run a command to show the catalogs and confirm that the tpch and iceberg catalogs are present, since these are what we will use in the CTAS query:

SHOW CATALOGS;

You should see that the Iceberg catalog is registered.

MinIO Buckets and Trino Schemas

Upon startup, the following command is executed on an initialization container that includes the mc CLI for MinIO. This creates a bucket in MinIO called /alluxio, which gives us a location to write our data to, and we can tell Trino where to find it:

/bin/sh -c "
until (/usr/bin/mc config host add minio http://minio:9000 minio minio123) do echo '...waiting...' && sleep 1; done;
/usr/bin/mc rm -r --force minio/alluxio;
/usr/bin/mc mb minio/alluxio;
/usr/bin/mc policy set public minio/alluxio;
exit 0;
"

Note: This bucket will act as the mount point for Alluxio, so the schema directory alluxio://lakehouse/ in Alluxio will map to s3://alluxio/lakehouse/.
Querying Trino

Let's move on to creating our SCHEMA, which points to the bucket in MinIO, and then run our CTAS query. Back in the terminal, create the iceberg.lakehouse SCHEMA. This will be the first call to the metastore to save the schema location in the Alluxio namespace. Notice that we need to specify the hostname alluxio-leader and port 19998, since we did not set Alluxio as the default file system. Take this into consideration if you want Alluxio caching to be the default and transparent to users managing DDL statements:

CREATE SCHEMA iceberg.lakehouse
WITH (location = 'alluxio://alluxio-leader:19998/lakehouse/');

Now that we have a SCHEMA that references the bucket where we store our tables in Alluxio, which syncs to MinIO, we can create our first table.

Optional: To view your queries as they run, log into the Trino UI using any username (it doesn't matter, since no security is set up).

Move the customer data from the tiny generated TPCH data into MinIO using a CTAS query. Run the following query and, if you like, watch it run in the Trino UI:

CREATE TABLE iceberg.lakehouse.customer
WITH (
    format = 'ORC',
    location = 'alluxio://alluxio-leader:19998/lakehouse/customer/'
) AS SELECT * FROM tpch.tiny.customer;

Go to the Alluxio UI and the MinIO UI, and browse the Alluxio and MinIO files. You will now see a lakehouse directory containing a customer directory with the data written by Trino to Alluxio, and Alluxio writing it to MinIO.

Now that there is a table under Alluxio and MinIO, you can query the data:

SELECT * FROM iceberg.lakehouse.customer LIMIT 10;

How can we be sure that Trino is actually reading from Alluxio and not MinIO? Let's delete the data in MinIO and run the query again just to be sure. Once you delete this data, you should still see data returned.

Stopping Services

Once you complete this tutorial, the resources used for this exercise can be released by running the following command:

docker-compose down

Conclusion

At this point, you should have a better understanding of Trino and Alluxio, how to get started with deploying Trino and Alluxio, and how to use Alluxio caching with the Iceberg connector and MinIO file storage. I hope you enjoyed this article. Be sure to like this article and comment if you have any questions!
We're building a Google Photos clone, and testing is damn hard! How do we test that our Java app spawns the correct ImageMagick processes, or that the resulting thumbnails are the correct size and indeed thumbnails, not just random pictures of cats? How do we test different ImageMagick versions and operating systems?

What's in the Video

00:00 Intro
We start the video with a general overview of what makes testing our Google Photos clone so tricky. In the last episode, we started extracting thumbnails from images, and we now need a way to test that. As this is done via an external ImageMagick process, we are in for a ride.

01:05 Setting Up JUnit and Writing the First Test Methods
First off, we will set up JUnit 5. As we're not using a framework like Spring Boot, it serves as a great exercise to add the minimal set of libraries and configuration that gets us up and running with JUnit. Furthermore, we will write some test method skeletons while thinking about how we would approach testing our existing code, taking care of test method naming, etc.

04:19 Implementing ImageMagick Version Detection
In the last episode, we noticed that running our Java app on different systems leads to unexpected results or just plain errors. That is because different ImageMagick versions offer different sets of APIs that we need to call. Hence, we need to adjust our code to detect the installed ImageMagick version and also add a test method that checks that ImageMagick is indeed installed before running any tests.

10:32 Testing Trade-Offs
As is apparent with detecting ImageMagick versions, the real problem is that reaching 100% test coverage across a variety of operating systems and installed ImageMagick versions would require a pretty elaborate CI/CD setup, which we don't have in the scope of this project. So we discuss the pros and cons of our approach.

12:00 Implementing @EnabledIfImageMagickIsInstalled
What we can do, however, is make sure that the rest of our test suite only runs if ImageMagick is installed. Thus, we will write a custom JUnit 5 annotation called EnabledIfImageMagickIsInstalled that you can add to any test method or even whole classes to enable said behavior. If ImageMagick is not installed, the tests simply will not run instead of displaying an ugly error message.

16:05 Testing Successful Thumbnail Creation
The biggest problem to tackle is: How do we properly assert that thumbnails were created correctly? We approach this question by testing for ImageMagick's exit code, estimating file sizes, and also loading the image and making sure it has the correct number of pixels. All of this with the help of AssertJ and its SoftAssertions, which let us easily combine multiple assertions into one.

23:59 Still Only Works on My Machine
Even after having tested our whole workflow, we still need to make sure to call a different ImageMagick API for different versions. We can quickly add that behavior to support IM6 as well as IM7, and we are done.

25:53 Deployment
Time to deploy the application to my NAS. And this time around, everything works as expected!

26:20 Final Testing Thoughts
We did a fair amount of testing in this episode. Let's sum up all the challenges and pragmatic testing strategies that we learned about.

27:31 What's Next
We'll finish the episode by having a look at what's next: multithreading issues! See you in the next episode.
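To make the ideas at 12:00 and 16:05 a bit more concrete outside the video, here are two rough sketches, assuming JUnit 5 and AssertJ are on the classpath. They are illustrative only: the probe command, file paths, size limits, and the createThumbnail helper are assumptions made for the example, not the project's actual code.

Java
import static org.junit.jupiter.api.extension.ConditionEvaluationResult.disabled;
import static org.junit.jupiter.api.extension.ConditionEvaluationResult.enabled;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.extension.ExtensionContext;

// Composed annotation: a test method or class tagged with it only runs when ImageMagick is found.
@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@ExtendWith(EnabledIfImageMagickIsInstalled.ImageMagickCondition.class)
public @interface EnabledIfImageMagickIsInstalled {

    class ImageMagickCondition implements ExecutionCondition {

        @Override
        public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext context) {
            try {
                // "magick" is the IM7 entry point; a fuller check might also probe IM6's "convert".
                Process probe = new ProcessBuilder("magick", "--version")
                        .redirectErrorStream(true)
                        .start();
                if (probe.waitFor(5, TimeUnit.SECONDS) && probe.exitValue() == 0) {
                    return enabled("ImageMagick detected on this machine");
                }
                probe.destroyForcibly();
            } catch (Exception e) {
                // Treat any failure to launch the probe as "not installed."
            }
            return disabled("ImageMagick is not installed, skipping test");
        }
    }
}

And a sketch of grouping several thumbnail checks with AssertJ's SoftAssertions so that every failed expectation is reported in one go:

Java
import java.awt.image.BufferedImage;
import java.nio.file.Files;
import java.nio.file.Path;

import javax.imageio.ImageIO;

import org.assertj.core.api.SoftAssertions;
import org.junit.jupiter.api.Test;

class ThumbnailsTest {

    @Test
    @EnabledIfImageMagickIsInstalled
    void createsAProperThumbnail() throws Exception {
        Path thumbnail = Path.of("/tmp/thumbnail.jpg"); // illustrative output path
        int exitCode = createThumbnail(thumbnail);      // hypothetical stand-in for the code under test

        BufferedImage image = ImageIO.read(thumbnail.toFile());

        SoftAssertions softly = new SoftAssertions();
        softly.assertThat(exitCode).isZero();
        softly.assertThat(Files.size(thumbnail)).isLessThan(250_000); // rough ceiling for a thumbnail
        softly.assertThat(image.getWidth()).isLessThanOrEqualTo(300); // assumed target dimensions
        softly.assertThat(image.getHeight()).isLessThanOrEqualTo(300);
        softly.assertAll(); // reports every failed check at once
    }

    // Placeholder so the sketch compiles; the real implementation spawns ImageMagick.
    private int createThumbnail(Path target) {
        return 0;
    }
}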
With hands-on experience in AWS DevOps and Google SRE, I’d like to offer my insights on the comparison of these two systems. Both have proven to be effective in delivering scalable and reliable services for cloud providers. However, improper management can result in non-functional teams and organizations. In this article, I’ll give a brief overview of AWS DevOps and Google SRE, examine when they work best, delve into potential pitfalls to avoid, and provide tips for maximizing the benefits of each. DevOps DevOps is a widely used term with multiple interpretations. In this article, I’ll focus on AWS DevOps, which, according to the AWS blog, merges development and operations teams into a single unit. Under this model, engineers work across the entire application lifecycle, from development to deployment to operations. They possess a wide range of skills rather than being limited to a specific function. As a result, the same engineers who write the code are responsible for running the service, monitoring it, and responding to incidents. In practice, every team may have its own approach, but there is some degree of unification of practices, such as with CI/CD, incident prevention, and blameless post-mortems. Personally, I consider AWS to have the most effective operational culture among all the organizations I’ve worked with. Advantages of the DevOps Approach When DevOps is implemented effectively, it can provide several benefits, especially in the early stages of development. For start-ups looking to bring a new product to market quickly, DevOps can offer speed and agility. Similarly, established companies launching a new service or product can also benefit from the DevOps model. Although the same team operates the system, there may be some specialization, with some team members focusing more on operations and others on development. Over time, as the product matures, teams may split, with a platform team (akin to SRE) working alongside a development team (akin to SWE). However, the integration and overlap of operational activities by the development engineers and deep understanding of the system by the operational engineers remain tight. This tight feedback loop leads to a better understanding of how the system runs, its limitations, and the customer experience by all team members. This, in turn, makes decision-making and iteration cycles faster. This is likely a contributing factor to AWS’ dominance in the market and the large number of offerings it provides. When DevOps Goes Wrong Generally, operations can be divided into three main categories: Service operations Incident prevention Incident response While service operations are often seen as enjoyable by software engineers, incident prevention may not be as engaging, and incident response can become overwhelming, particularly when engineers are responsible for development and operations. The more time they spend on operational tasks, the less time they have for development and the more dissatisfied they become with their job. This can result in a vicious cycle of overworked engineers, high turnover, decreased work quality, and a growing workload for operations. Site Reliability Engineering (SRE) Site Reliability Engineering (SRE) is a discipline developed by Google to improve the reliability and availability of software systems. It involves a dedicated team of SREs who focus solely on these goals, while software engineers (SWEs) handle writing the code. 
SRE brings a formalized set of principles and terminology, such as Service Level Indicators (SLIs), Service Level Objectives (SLOs), error budgets, toil, and others, to ensure the software is scalable and meets performance standards. Benefits of Site Reliability Engineering When SRE is implemented effectively, it provides a high level of standardization and consistency in measuring customer experience. This approach doesn’t necessarily result in more reliable or performant services, but it ensures that best practices are followed across multiple products. By having dedicated SRE teams, it reduces the burden of operations on the software engineers, who no longer need to deal with operational issues at all hours of the day and night. As a result, software engineers can have a better work-life balance, while the SRE team ensures that operational needs are met in a consistent and efficient manner. When SRE Goes Wrong In the SRE model, software engineers (SWEs) are freed from the operational burden; however, this can result in a lack of exposure to the workings of the system, leading to vague risk assessments and limited understanding of how their code behaves in different conditions. On the other hand, SREs may be overburdened with an excessive number of pages, which can slow down development by becoming overly risk-averse. This, in turn, affects the SWEs who then become risk-averse and struggle to get approvals from SREs. This disconnect between the two teams, with SWEs perceiving the service as a black box and SREs lacking an understanding of the code and intent, can lead to a semi-functioning organization where deploying code to production may take months and the majority of initiatives never see the light of day. Which One Is Better? The answer is not that simple. Neither DevOps nor SRE is inherently better or worse, they both have their own strengths and weaknesses. When it comes to DevOps, it’s crucial to ensure that engineers are not overburdened with operational tasks, and that they have a healthy work-life balance. This can be achieved by proper investment in tooling and a focus on quality output. Additionally, it’s important to strike a balance between development and operations to avoid a situation where either one of the two becomes more dominant and hinders the progress of the other. On the other hand, SRE is designed to alleviate the operational burden from software engineers and protect them from the distractions of incident management and other operational tasks. However, it’s important to avoid a disconnect between the SWEs and SREs and ensure that each team has a comprehensive understanding of the system. Additionally, SREs should not only be focused on operational metrics, but also be interested in delivery and should have skin in the game. In other words, both DevOps and SRE have their own advantages and disadvantages, and the best approach will depend on the needs and culture of your organization. The key is to avoid the pitfalls of each system and strive for a balanced and effective approach to software delivery. Balancing Speed and Stability Balancing speed and stability is a critical aspect in the DevOps vs SRE debate. The approach that a company takes will depend on its stage and goals. Start-ups often prioritize speed and agility to bring their product to market quickly, making DevOps the ideal choice. As the company grows, stability and reliability become more important to maintain customer trust, making SRE a better fit. 
However, the transition from DevOps to SRE does not mean giving up on the principles of speed and agility. An effective SRE model can still strike a balance between reliability and speed by ensuring close collaboration between SWEs and SREs. The SWEs drive the development process, while the SREs ensure the system is reliable and scalable. Regular hat-swapping rotations and joint operational meetings can keep both teams tight-knit and aligned with delivery and stability goals. This approach offers the best of both worlds solution. Closing Thoughts The choice between DevOps and SRE is not straightforward. The best approach depends on the situation of your company and what it needs. By combining the advantages of both, you can find the sweet spot between speed and stability, ensuring that you keep delivering great software. To make this possible, it’s vital for technology and operations engineers to collaborate closely. Sharing responsibilities and meeting regularly can help keep everyone on the same page, with a focus on delivery and maintaining smooth operations. This can result in both DevOps and SRE working effectively.
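One piece of SRE vocabulary worth pinning down with a number is the error budget, since it is what makes the speed-versus-stability trade-off explicit. As an illustrative calculation (not tied to any particular team): a 99.9% availability SLO over a 30-day window allows 0.1% of 30 × 24 × 60 = 43,200 minutes, roughly 43 minutes of downtime per month. While budget remains, SWEs can ship aggressively; once it is spent, the usual practice is to slow or pause risky releases until reliability recovers.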
As digital transformation reaches more industries, the number of data points generated is growing exponentially. As such, data integration strategies to collect such large volumes of data from different sources in varying formats and structures are now a primary concern for data engineering teams. Traditional approaches to data integration, which have largely focused on curating highly structured data into data warehouses, struggle to deal with the volume and heterogeneity of new data sets. Time series data present an additional layer of complexity. By nature, the value of each time series data point diminishes over time as the granularity of the data loses relevance as it gets stale. So it is crucial for teams to carefully plan data integration strategies into time series databases (TSDBs) to ensure that the analysis reflects the trends and situation in near real-time. In this article, we’ll examine some of the most popular approaches to data integration for TSDBs: ETL (Extract, Transform, Load) ELT (Extract, Load, Transform) Data Streaming with CDC (Change Data Capture) Given the need for real-time insights for time series data, most modern event-driven architectures now implement data streaming with CDC. To illustrate how it works in practice, we will walk through a reference implementation with QuestDB (a fast TSDB) to show that CDC can flexibly handle the needs of a time series data source. Extract, Transform, Load (ETL) ETL is a traditional and popular data integration strategy that involves first transforming the data into a predetermined structure before loading the data into the target system (typically a data warehouse). One of the main advantages of ETL is that it provides the highest degree of customization. Since the data is first extracted to a staging area where it is transformed into a clean, standardized format, ETL systems can handle a wide range of formats and structures. Also, once the data is loaded into the data warehouse, data science teams can run efficient queries and analyses. Finally, given the maturity of the ETL ecosystem, there is a plethora of enterprise-grade tools to choose from. On the other hand, ETL is both time- and resource-intensive to maintain. The logic for sanitizing and transforming the data can be complex and computationally expensive. This is why most ETL systems are typically batch-oriented, only loading the data into the warehouse periodically. As the volume of data and the sources of data grows, this can become a bottleneck. Given these qualities, ETL systems are most used for datasets that require complex transformation logic before analysis. It can also work well for datasets that do not require real-time insights and can be stored for long-term trend analysis. Extract, Load, Transform (ELT) ELT, as the name suggests, loads the data first into the target system (typically a data lake) and performs the transformation within the system itself. Given the responsibilities of the target system to handle both fast loads and transformations, ELT pipelines usually leverage modern, cloud-based data lakes that can deal with the processing requirements. Compared to ETL pipelines, ELT systems can provide more real-time analysis of the data since raw data is ingested and transformed on the fly. Most cloud-based data lakes provide SDKs or endpoints to efficiently ingest data in micro-batches and provide almost limitless scalability. However, ELT is not without downsides. 
Since transformation is done by the target system, such operations are limited by the capabilities supported by the data lakes. If you need more complex transformation logic, additional steps may be needed to re-extract the data and store it in a friendlier format. For most use cases, ELT is a more efficient data integration strategy than ETL. If your application can leverage cloud-based tools and does not require complex processing, ELT can be a great choice for handling large amounts of data in near real-time.

Change Data Capture (CDC)

For new projects, teams can plan to use TSDBs as one of the target systems in an ETL or ELT pipeline. However, for existing projects, either migrating to or adding a TSDB into the mix can be a challenge. For one, work may be required to modify the existing driver/SDK or use a new one to stream data into the TSDB. Even if the same drivers are supported, data formats may also need to change to take full advantage of TSDB capabilities.

CDC tools can be useful to bridge this gap. CDC is simple in principle: CDC tools such as Debezium continuously monitor changes in the source system and notify your data pipeline whenever there is a change. The application causing the change is often not even aware that a CDC process is listening for changes. This makes CDC a good fit for integrating new real-time data pipelines into existing architectures because it requires little or no change to the existing applications. As such, CDC can be used in conjunction with either ETL or ELT pipelines. For example, the source system can be a SQL RDBMS (e.g., MySQL, PostgreSQL, etc.) or a NoSQL DB (e.g., MongoDB, Cassandra), and one of the target systems can be a TSDB along with other data lakes or warehouses.

The main advantage of using CDC for data integration is that it provides real-time data replication. Unlike traditional ETL and ELT systems that work with batches, the changes to the source system are continuously streamed into one or more target systems. This can be useful for replicating either a subset of the data or the entire data set across multiple databases in near real-time. The target databases may even be in different geographical regions or serve different purposes (i.e., long-term storage vs. real-time analytics). For time series data, where the changes in value over time are often the most useful signal, CDC efficiently captures that delta for real-time insights.

Reference Implementation for CDC

To illustrate how CDC works more concretely, let's take the reference implementation I wrote about recently to stream stock prices into QuestDB. At a high level, a Java Spring App publishes stock price information into PostgreSQL. The Debezium Connector then reads the changes from PostgreSQL and publishes them onto Kafka. On the other side, QuestDB's Kafka Connector reads from the Kafka topics and streams the records into QuestDB. For a deeper dive, please refer to Change Data Capture with QuestDB and Debezium.

In this reference architecture, the Java Spring App transforms and loads the data into PostgreSQL before it's replicated to the TSDB for further analysis. If a more ELT-like pipeline is desired, raw data from another provider could instead be loaded directly into PostgreSQL and transformed later in QuestDB. The important thing to note with this architecture is that CDC can seamlessly integrate with existing systems. From the application standpoint, it retains the transactional guarantees of PostgreSQL while adding a new TSDB component down the pipeline.
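To make the moving parts of such a pipeline a little more tangible, below is a minimal, illustrative sketch of a Kafka Connect configuration for Debezium's PostgreSQL connector, roughly the shape of the source side of the reference implementation. The hostnames, credentials, table list, and topic prefix are placeholders, and exact property names differ between Debezium versions, so treat the linked reference implementation as the source of truth.

# Debezium PostgreSQL source connector (illustrative values only)
name=stock-prices-connector
connector.class=io.debezium.connector.postgresql.PostgresConnector
# placeholder connection details
database.hostname=postgres
database.port=5432
database.user=postgres
database.password=postgres
database.dbname=stocks
# placeholder table to capture
table.include.list=public.stock_prices
# called database.server.name in older Debezium releases
topic.prefix=stocks
# logical decoding plugin built into PostgreSQL 10+
plugin.name=pgoutput

On the sink side, a Kafka Connect sink connector for QuestDB, as used in the reference implementation, subscribes to the resulting topics and writes the rows into a QuestDB table.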
Conclusion Data integration plays an important role for organizations that use a TSDB to store and analyze time series data. In this article, we looked at some advantages and disadvantages of using ETL or ELT. We also examined how CDC can be used in conjunction with those pipelines to provide real-time replication into TSDBs. Given the special qualities of time series data, using a TSDB to properly store and analyze them is important. If you are starting fresh, look to build an ELT pipeline to stream data into a TSDB. To integrate with an existing system, look at utilizing CDC tools to limit the disruption to the current architecture.
What was once a pipe dream is now a reality: advances in technology over the past decade have allowed businesses to harness the power of real-time data. However, while over 80% of businesses say transforming to a real-time enterprise is critical to meeting customer expectations, only 12% have optimized their processes for real-time customer experiences, according to 451 Research data. Real-time data is not just a nice-to-have, but a must-have for businesses to stay afloat in the “right now” economy. There are still some clear use cases for batch; for example, payroll or billing typically involve processing a large number of transactions on a regular basis. Batch processing allows companies to efficiently process payroll transactions in a single batch rather than processing them individually in real time. However, more companies are switching to real-time in cases where getting and analyzing continuous data really matters. In this article, we’ll evaluate batch and real-time analytics, look at why switching from batch processing to real-time can prove useful, and how to move from a traditional tech stack to one that supports real-time analytics. Let’s dive in. The Problems With Batch Data Processing Batch analytics refers to the processing and analysis of a high volume of data that has already been stored for a period of time. For example, companies process their financial reports on a quarterly and monthly basis. As computers became more powerful, though, batch systems started delivering data faster and faster — first in weeks, then days, then hours. With batch data processing, you might end up with data inconsistencies, as it may not reflect the most up-to-date state of the data. For example, if a batch process is run at night to update a database, any changes made to the data during the day will not be reflected in the updated database until the next batch process is run. In nearly every case, it’s more valuable to have an answer now than it is to have an answer next week. In fact, data can lose its value even when it’s just a couple of milliseconds or microseconds old. Take financial instruments: it pays to know the current price of one down to the nanosecond. What’s most important today is your data’s freshness, latency, and value. We’ve come to the point of needing data as it happens, meaning real-time is now the norm. Newer technologies like Apache Kafka® have replaced the traditional extraction, transformation, and loading (ETL) process. Now that companies can deliver huge amounts of data in near real-time, there’s less need for batch processing and delays with delivery. We’ve come this far and there’s no going back with user and customer experience. There’s less and less room for slow analytics when nearly every industry requires real-time decision-making. While real-time used to be a specialized version of batch, batch is becoming a specialized version of real-time. And there are risks to not taking advantage of real-time analytics. Why Moving From Batch To Real-Time Analytics Is a Good Idea Build Resilient Data Pipelines Since we’ve come to rely on real-time analytics, resilience is absolutely paramount. Real-time data pipelines are resilient at heart, meaning they can easily adapt in the event of failures. While real-time data processing is more technically challenging, it's much easier to change a real-time calculation. With batch, your master file might not be up to date and it’s risky if you mess up. Even minor errors, like typos, can bring your batch process to a halt. 
If you collect data for a few hours and one batch fails, then the next one will be double the size. If your machine isn’t big enough to store all the data, then you need to scale your machines, which can be quite chaotic. With real-time, information is always up to date and it’s much easier to detect anomalies. Companies are moving away from passive risk mitigation to active risk management thanks to real-time analytics. For example, you can detect fraud signals in real-time and stop fraudulent transactions from completing. By switching to real-time, you’re better placed to depend on the accuracy and security of your data. And, of course, your processing time will also be faster. Your Competitors Are Already Riding the Real-Time Wave Once a company goes real-time, everyone else has to play catch up. Whether you like it or not, real-time is the direction the world is going. By 2025, nearly 30% of all data will be consumed in real-time, and the transformation is already well underway. You’ve likely heard the famous quote attributed to Henry Ford: “If I had asked my customers what they wanted, they would have said faster horses.” Not transitioning to real-time is the equivalent of saying, “We're gonna stick with horses.” By sticking with batch, you are choosing to fight a much harder battle than you need to. Moving to real-time analytics helps you gain a competitive advantage and make new discoveries that could help grow your business. Efficiency Will Skyrocket Companies using real-time analytics see huge efficiency gains. Consider how much real-time has already driven a boom in productivity: we wouldn’t have food delivery apps, the gig economy, or Uber without real-time systems. Sure, ordering a taxi for tomorrow is great, but it's not the same as ordering one right when you need it. With real-time analytics, warehouses can streamline their operations and monitor the conditions of their goods. They can ensure factories won’t run out of equipment or raw materials at the wrong time and predict when a machine is about to overheat so that they can divert some workload to a different factory. Or let’s say you have a pharmaceutical supply chain and you notice one product is out of stock in Germany — but just over the border in Switzerland, you have a surplus. Rather than only finding out about the surplus in Switzerland later, real-time analytics allow you to send the products to Germany the moment they’re needed there. When it comes to efficiency, the power of real-time analytics benefits nearly every industry. Increase Revenue 71% of technology leaders agree that they can tie their revenue growth directly to real-time data, according to the State of the Data Race 2022 report. While batch processing comes with a lower up-front investment, companies can actually cut costs with real-time since aggregating and joining data before ingestion requires less storage and processing. You can also perform serverless computing with real-time, where you only pay for compute time when you need it. Tools like AWS Lambda® allow you to cut costs and run applications during times of peak demand without crashing or over-provisioning resources. The main cost with real-time is the opportunity cost to set it up, and the return on investment is potentially huge. Major payment companies, for instance, save millions per day in revenue potentially lost to fraud with the help of real-time analytics. 
E-commerce companies can also increase their average cart transaction size with personalized recommendations, and reduce the number of abandoned carts with reminder emails. Every second wasted can cost millions for today’s data-driven production lines. For example, if you have a forging production line, you can’t afford to make slow decisions. The metal equipment has to stay hot, and if it gets cold, you can lose millions of dollars per second. You don’t want to make the wrong call, have to shut down production for 10 minutes, and suddenly find yourself losing hundreds of millions of dollars due to delayed data collection. Empower Data Analysts To Do Their Jobs Well Real-time data empowers data analysts with a more complete and accurate picture of the data they are working with, which can be very helpful in their jobs. If you've spent thousands of hours and millions of dollars a year hiring a team of smart people whose jobs are to analyze your data and share the results, why would you limit them to only considering the situation every 20 minutes? How To Move From Batch To Real-Time Analytics Keep in mind that when switching from batch to real-time, you don’t lose a thing. You’re not trading in batch for real-time. Instead, you’re starting at a better place with a real-time system, and you could always add your batch system back on top if you still need it. To make the switch to a real-time analytics pipeline, use open-source tools like Apache Kafka®, Apache Flink®, and ClickHouse® (KFC). The KFC stack allows you to build up a robust, scalable architecture for getting the most from your data, whether it be batched ETL or real-time metrics. By using real-time tools, you can denormalize the data into a database management system (DBMS) like ClickHouse to allow high-speed access to the joined-up data. Many other tools are moving in the direction of real-time. Materialize, for example, offers a distributed streaming database that enables immediate, widespread adoption of real-time data for applications, business functions, and other data products. Israeli startup Firebolt allows you to deliver sub-second and highly concurrent analytics experiences over big and granular data. Ultimately, moving to real-time does require more than just adopting new tools. It requires a change of mindset. Companies need to modernize their data architectures to move at machine speed rather than people-speed. The Real-Time Tide Will Keep Rising When the Internet first arose some people thought it was just a passing fad or an overhyped idea that would fizzle out. But look where we are today. The same thing is happening now with real-time analytics. While real-time technology will no doubt change in the coming years, the process itself isn’t going away. Rather, it will keep evolving to become even faster. So, will you be left behind or will you switch to real-time?
Application Dependency Mapping is the process of creating a graphical representation of the relationships and dependencies between different components of a software application. This includes dependencies between modules, libraries, services, and databases. It helps to understand the impact of changes in one component on other parts of the application and aids in troubleshooting, testing, and deployment. Software Dependency Risks Dependencies are often necessary for building complex software applications. However, development teams should be mindful of dependencies and seek to minimize their number and complexity for several reasons: Security vulnerabilities: Dependencies can introduce security threats and vulnerabilities into an application. Keeping track of and updating dependencies can be time-consuming and difficult. Compatibility issues: Dependencies can cause compatibility problems if their versions are not managed properly. Maintenance overhead: Maintaining a large number of dependencies can be a significant overhead for the development team, especially if they need to be updated frequently. Performance impact: Dependencies can slow down the performance of an application, especially if they are not optimized. Therefore, it's important for the development team to carefully map out applications and their dependencies, keep them up-to-date, and avoid using unnecessary dependencies. Application security testing can also help identify security vulnerabilities in dependencies and remediate them. Types of Software Dependencies Functional Functional dependencies are a type of software dependencies that are required for the proper functioning of a software application. These dependencies define the relationships between different components of the software and ensure that the components work together to deliver the desired functionality. For example, a software component may depend on a specific library to perform a specific task, such as connecting to a database, performing a calculation, or processing data. The library may provide a specific function or set of functions that the component needs to perform its task. If the library is unavailable or the wrong version, the component may not be able to perform its task correctly. Functional dependencies are important to consider when developing and deploying software because they can impact the functionality and usability of the software. It's important to understand the dependencies between different components of the software and to manage these dependencies effectively in order to ensure that the software works as expected. This can involve tracking the dependencies, managing version compatibility, and updating dependencies when necessary. Development and Testing Development and testing dependencies are software dependencies that are required during the development and testing phases of software development but are not required in the final deployed version. For example, a developer may use a testing library, such as JUnit or TestNG, to write automated tests for the software. This testing library is only required during development and testing but is not needed when the software is deployed. Similarly, a developer may use a build tool, such as Gradle or Maven, to manage the dependencies and build the software. This build tool is only required during development and testing but is not needed when the software is deployed. 
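As a concrete illustration of how such development and testing dependencies stay out of the shipped artifact, here is a hedged Maven snippet; the artifact and version shown are examples, not a recommendation tied to this article. Declaring a library with the test scope keeps it on the classpath for compiling and running tests only, so it is not packaged into the final application.

XML
<!-- Available only while compiling and running tests; excluded from the packaged application. -->
<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter</artifactId>
    <!-- example version -->
    <version>5.9.2</version>
    <scope>test</scope>
</dependency>

Gradle offers the equivalent testImplementation configuration for the same purpose.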
Development and testing dependencies are important to consider because they can impact the development and testing process and can add complexity to the software. It's important to understand and manage these dependencies effectively in order to ensure that the software can be developed, tested, and deployed effectively. This can involve tracking the dependencies, managing version compatibility, and updating dependencies when necessary. Additionally, it's important to ensure that development and testing dependencies are not included in the final deployed version of the software in order to minimize the size and complexity of the deployed software. Operational and Non-Functional Operational dependencies are dependencies that are required for the deployment and operation of the software. For example, an application may depend on a specific version of an operating system, a specific version of a web server, or a specific version of a database. These dependencies ensure that the software can be deployed and run in the desired environment. Non-functional dependencies, on the other hand, are dependencies that relate to the non-functional aspects of the software, such as performance, security, and scalability. For example, an application may depend on a specific version of a database in order to meet performance requirements or may depend on a specific security library in order to ensure that the application is secure. It's important to understand and manage both operational and non-functional dependencies effectively in order to ensure that the software can be deployed and run as expected. This can involve tracking the dependencies, managing version compatibility, and updating dependencies when necessary. Additionally, it's important to ensure that non-functional dependencies are configured correctly in order to meet the desired performance, security, and scalability requirements. 5 Benefits of Application Mapping for Software Projects Improved Understanding of the Project One of the primary benefits of application mapping is that it helps team members better understand the system as a whole. The visual representation of the relationships and interactions between different components can provide a clear picture of how the system operates, making it easier to identify areas for improvement or optimization. This can be especially useful for new team members, who can quickly get up to speed on the system without having to spend a lot of time reading through documentation or trying to decipher complex code. Facilitated Collaboration Another benefit of application mapping is that it can be used as a tool for communication and collaboration between different stakeholders involved in the software project. By providing a visual representation of the system, application mapping can help to foster a shared understanding between developers, business stakeholders, and other stakeholders, improving collaboration and reducing misunderstandings. Early Identification of Problems Application mapping can also help to identify potential issues early in the project before they become significant problems. By mapping out the relationships between different components, it is possible to identify areas where conflicts or dependencies could cause problems down the line. This allows teams to address these issues before they become major roadblocks, saving time and reducing the risk of delays in the project. 
Increased Efficiency
Another benefit of application mapping is that it can help to optimize workflows and processes, reducing duplication and improving the efficiency of the overall system. By mapping out the flow of data and interactions between different components, it is possible to identify areas where processes can be streamlined or made more efficient, reducing waste and improving performance.
Better Decision-Making
Application mapping can be used to make informed decisions about future development and changes to the system. By allowing teams to understand the potential impact of changes to one part of the system on other parts, application mapping can help to reduce the risk of unintended consequences and ensure that changes are made with a full understanding of their impact on the overall system. This can help to improve the quality of the final product and reduce the risk of costly mistakes.
Conclusion
In conclusion, application mapping provides a clear and visual representation of the software architecture and the relationships between different components. This information can be used to improve understanding, facilitate collaboration, identify problems early, increase efficiency, and support better decision-making.
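To make the impact-analysis idea behind application mapping a little more concrete, here is a minimal, purely hypothetical TypeScript sketch. The component names and the shape of the map are invented for illustration only: the dependency map is modeled as a simple adjacency list, and a reverse walk over it lists every component that may be affected by a change.
TypeScript
// Hypothetical illustration: a tiny dependency map as an adjacency list,
// where an entry "A: [B, C]" means component A depends on B and C.
const dependsOn: Record<string, string[]> = {
  "checkout-service": ["payment-lib", "orders-db"],
  "orders-api": ["orders-db"],
  "payment-lib": ["http-client"],
};

// Given a changed component, walk the map in reverse to find everything
// that directly or transitively depends on it and may be impacted.
function impactedBy(changed: string): string[] {
  const impacted = new Set<string>();
  const visit = (component: string) => {
    for (const [dependent, deps] of Object.entries(dependsOn)) {
      if (deps.includes(component) && !impacted.has(dependent)) {
        impacted.add(dependent);
        visit(dependent); // follow transitive dependents
      }
    }
  };
  visit(changed);
  return [...impacted];
}

console.log(impactedBy("orders-db"));   // ["checkout-service", "orders-api"]
console.log(impactedBy("http-client")); // ["payment-lib", "checkout-service"]
Real application dependency mapping tools build this picture automatically from build files, service registries, and runtime traces, but the underlying question they answer is the same as in this sketch: "if this changes, what else might break?"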
As with many engineering problems, there are many ways to build RESTful APIs. Most of the time, when building RESTful APIs, engineers prefer to use frameworks. API frameworks provide an excellent platform for building APIs, with most of the necessary components available straight out of the box. In this post, we will explore the 10 most popular REST API frameworks for building web APIs. These frameworks span multiple languages and varying levels of complexity and customization. First, let's dig into some key factors in deciding which framework to begin building with.
How To Pick an API Framework
The first factor in choosing an API framework is usually deciding which language you want to work in. For many projects, depending on the organization you work with or your experience, choices may be limited. The usual recommendation is to go with a language you are already familiar with, since learning a new language and a new framework at the same time can lead to less-than-optimal implementations. If you're already familiar with the language, your main focus can be on understanding the framework and building efficiently.
Once the language is decided, you may have multiple choices of frameworks that support your language of choice. At this point, you will need to decide based on what types of functionality you require from your APIs. Some frameworks will have plugins and dependencies that allow for easy integration with other platforms, some may support your use case more precisely, and others may lack functionality that you require, automatically disqualifying them. Making sure that your use case and required functionality are supported by the framework is key.
Last but not least, you should also consider the learning curve and the educational materials and docs available. As a developer, the availability of good documentation and examples is a massive factor in how quickly you and your team can scale up your APIs. Before deciding on a framework, browse the documentation and do a quick search to ensure that you can find examples that can guide you and inform you on how much effort is needed to build APIs in the framework of your choosing. Now that we have a few factors to consider, let's take a look at some popular framework options.
Spring Boot
Spring Boot is an open-source framework that helps developers build web and mobile apps. Developed by Pivotal Software, Spring Boot is intended to make the original Spring framework more user-friendly. You can easily start using Spring Boot out of the box without spending time configuring any of its libraries.
Programming Language: Java
Pros:
- Quick to load due to enhanced memory allocation
- Can be easily configured with XML configurations and annotations
- Easy to run since it includes a built-in server
Cons:
- Not backward compatible with previous Spring projects and no tools to assist with migration
- Binary size can be bloated from default dependencies
To learn more about the Spring Boot framework, you can check out the docs here.
Ruby on Rails
Ruby on Rails was originally developed as an MVC framework and has earned the name "the startup technology" among developers. The main purpose of the framework is to deliver apps with high performance. The high-performance standards of Ruby on Rails impressed developers using Python and PHP, and many of its concepts have been replicated in popular Python and PHP frameworks.
Programming Language: Ruby
Pros:
- Great framework for rapid development with minimal bugs
- Open-source with many tools and libraries available
- Modular design with an efficient package management system
Cons:
- Can be difficult to scale compared to other frameworks like Django and Express
- Limited multi-threading support for some libraries
- Documentation can be somewhat sparse, especially for 3rd-party libraries
To learn more about the Ruby on Rails framework, you can check out the docs here.
Flask
Flask is a Python framework developed by Armin Ronacher. Flask is more explicit than Django and is also easier to learn. It is based on the Web Server Gateway Interface (WSGI) toolkit and the Jinja2 template engine.
Programming Language: Python
Pros:
- Built-in development server and fast debugger
- Integrated support for unit testing
- RESTful request dispatching
- WSGI 1.0 compliant
- Unicode based
Cons:
- Included tools and extensions are lacking, and custom code is often required
- Security risks
- Larger implementations are more complex to maintain
To learn more about the Flask framework, you can check out the docs here.
Django REST
Django REST framework is a customizable toolkit that makes it easy to build APIs. It's based on Django's class-based views, so it can be an excellent choice if you're familiar with Django.
Programming Language: Python
Pros:
- The web browsable API is a huge win for web developers
- Developers can authenticate users on their web app with OAuth2
- Provides both ORM and non-ORM serialization
- Extensive documentation
- Easy to deploy
Cons:
- Learning curve
- Does not cover async
- Serializers are slow and impractical for JSON validation
To learn more about the Django REST framework, you can check out the docs here.
Express.js
Express.js is an open-source framework for Node.js that simplifies the process of development by offering a set of useful tools, features, and plugins.
Programming Language: JavaScript
Pros:
- Well documented
- Scales applications quickly
- Widely used with good community support
Cons:
- Lack of security
- Issues in the callbacks
- Request problems encountered with the middleware system
To learn more about the Express.js framework, you can check out the docs here.
Fastify
First created in 2016, Fastify is a web framework that is highly dedicated to providing the best developer experience possible. A powerful plugin architecture and minimal overhead also help make this framework a great choice for developers.
Programming Language: JavaScript
Pros:
- Easy development
- Performant and highly scalable
- The low overhead that grounds this framework minimizes operating costs for the entire application
Cons:
- Lack of documentation and community support
- Not as widely used in the industry
To learn more about the Fastify framework, you can check out the docs here.
Play Framework
Play is a web application framework for creating modern, robust applications using Scala and Java. Play integrates the components and APIs required for modern web application development.
Programming Language: Java, Scala
Pros:
- Intuitive user interface
- Simplified application testing
- Faster development on multiple projects
Cons:
- Steep learning curve
- Too many plug-ins, which can be unstable
- May not offer features for backward compatibility
To learn more about the Play framework, you can check out the docs here.
Gin
Gin is a fast framework for building web applications and microservices in the programming language Go. It provides a Martini-like API and enables users to build versatile and powerful applications with Go.
It contains common functionalities used in web development frameworks, such as routing, middleware support, and rendering.
Programming Language: Golang
Pros:
- Performance
- Easy to track HTTP method status codes
- Easy JSON validation
- Crash-free
Cons:
- Lack of documentation
- Syntax not concise
To learn more about the Gin framework, you can check out the docs here.
Phoenix
Phoenix is written in Elixir and implements the MVC pattern, so it will seem similar to frameworks like Ruby on Rails and Django. One interesting thing about Phoenix is that it offers channels for real-time features as well as pre-compiled templates. The pre-compiled templates render quickly, making sites feel smooth and responsive.
Programming Language: Elixir
Pros:
- Safe and efficient data filtering
- Elixir runs on the Erlang VM for improved web app performance
- Concurrency
Cons:
- Expensive processing speed
- Prior Erlang knowledge required
To learn more about the Phoenix framework, you can check out the docs here.
FastAPI
FastAPI is a web framework for developing RESTful APIs in Python. It fully supports asynchronous programming, so it can run on ASGI servers such as Uvicorn and Hypercorn. It also has editor support in popular IDEs, such as JetBrains PyCharm.
Programming Language: Python
Pros:
- High performance
- Easy to code with few bugs
- Short development time
- Supports asynchronous programming
Cons:
- Poor request validation
- Does not support singleton instances
- The main file can become crowded
To learn more about the FastAPI framework, you can check out the docs here.
Adding in API Analytics and Monetization
Building an API is only the start. Once your API is built, you'll want to make sure that you are monitoring and analyzing incoming traffic. By doing this, you can identify potential issues and security flaws and determine how your API is being used. These can all be crucial aspects in growing and supporting your APIs. As your API platform grows, you may be focused on API products. This marks the shift from simply building APIs into the domain of using APIs as business tools. Much like a more formal product, an API product needs to be managed and will likely be monetized. Building revenue from your APIs can be a great way to expand your business's bottom line.
Wrapping Up
In this article, we covered 10 of the most popular frameworks for developing RESTful APIs. We looked at a high-level overview and listed some points for consideration. We also discussed some key factors in how to decide on which API framework to use.
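To give a sense of how little code one of these frameworks needs for a basic endpoint, here is a minimal, hedged sketch using Express and TypeScript. It assumes the express package and its type definitions are installed; the route, port, and response payload are arbitrary examples, not part of any framework's documentation.
TypeScript
import express, { Request, Response } from "express";

const app = express();
app.use(express.json()); // parse JSON request bodies

// A single illustrative resource route; path and payload are made up.
app.get("/api/items/:id", (req: Request, res: Response) => {
  res.json({ id: req.params.id, status: "ok" });
});

// Start the server on an arbitrary port.
app.listen(3000, () => console.log("API listening on http://localhost:3000"));
Most of the frameworks listed above offer an equally small "hello world"; the differences show up later, in routing conventions, middleware, validation, documentation, and ecosystem support, which is exactly why the selection factors discussed earlier matter.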
The AngularPortfolioMgr project can import the SEC filings of listed companies. The importer class FileClientBean imports the JSON archive from "Kaggle." The data is provided by year, symbol, and period. Each JSON data set has keys (called concepts) and values with the USD value; for example, a concept such as IBM's full-year revenue for 2020 maps to a USD value. This makes two kinds of searches possible: a search for company data and a search for keys (concepts) over all entries. The components below "Company Query" select the year of the company values with operators like "=," ">=," and "<=" (year values less than 1800 are ignored). The symbol search is implemented with an Angular autocomplete component that queries the backend for matching symbols. The quarters are chosen in a select component listing the available periods. The components below "Available Sec Query Items" provide the Drag'n Drop component container with the items that can be dragged down into the query container. "Term Start" stands for an opening bracket that groups logical operators, and "Term End" stands for the matching closing bracket. A query item is a query clause on a key (concept). The components below "Sec Query Items" are the search terms in the query. The query components contain the query parameters for the concept and value with their operators for the query term. The terms are created with the bracket open/close wrappers so that collections of query clauses can be prefixed with "and," "or," "or not," and "not or" operators. The query parameters and the term structure are checked with a reactive Angular form that enables the search button if they are valid.
Creating the Form and the Company Query
The create-query.ts class contains the setup for the query:
TypeScript
@Component({
  selector: "app-create-query",
  templateUrl: "./create-query.component.html",
  styleUrls: ["./create-query.component.scss"],
})
export class CreateQueryComponent implements OnInit, OnDestroy {
  private subscriptions: Subscription[] = [];
  private readonly availableInit: MyItem[] = [ ... ];
  protected readonly availableItemParams = { ... } as ItemParams;
  protected readonly queryItemParams = { ... } as ItemParams;
  protected availableItems: MyItem[] = [];
  protected queryItems: MyItem[] = [
    ...
  ];
  protected queryForm: FormGroup;
  protected yearOperators: string[] = [];
  protected quarterQueryItems: string[] = [];
  protected symbols: Symbol[] = [];
  protected FormFields = FormFields;
  protected formStatus = '';
  @Output() symbolFinancials = new EventEmitter<SymbolFinancials[]>();
  @Output() financialElements = new EventEmitter<FinancialElementExt[]>();
  @Output() showSpinner = new EventEmitter<boolean>();

  constructor(
    private fb: FormBuilder,
    private symbolService: SymbolService,
    private configService: ConfigService,
    private financialDataService: FinancialDataService
  ) {
    this.queryForm = fb.group(
      {
        [FormFields.YearOperator]: "",
        [FormFields.Year]: [0, Validators.pattern("^\\d*$")],
        [FormFields.Symbol]: "",
        [FormFields.Quarter]: [""],
        [FormFields.QueryItems]: fb.array([]),
      },
      { validators: [this.validateItemTypes()] }
    );
    this.queryItemParams.formArray = this.queryForm.controls[
      FormFields.QueryItems
    ] as FormArray;
    //delay(0) fixes "NG0100: Expression has changed after it was checked" exception
    this.queryForm.statusChanges
      .pipe(delay(0))
      .subscribe((result) => (this.formStatus = result));
  }

  ngOnInit(): void {
    this.symbolFinancials.emit([]);
    this.financialElements.emit([]);
    this.availableInit.forEach((myItem) => this.availableItems.push(myItem));
    this.subscriptions.push(
      this.queryForm.controls[FormFields.Symbol].valueChanges
        .pipe(
          debounceTime(200),
          switchMap((myValue) => this.symbolService.getSymbolBySymbol(myValue))
        )
        .subscribe((myValue) => (this.symbols = myValue))
    );
    this.subscriptions.push(
      this.configService.getNumberOperators().subscribe((values) => {
        this.yearOperators = values;
        this.queryForm.controls[FormFields.YearOperator].patchValue(
          values.filter((myValue) => myValue === "=")[0]
        );
      })
    );
    this.subscriptions.push(
      this.financialDataService
        .getQuarters()
        .subscribe(
          (values) =>
            (this.quarterQueryItems = values.map((myValue) => myValue.quarter))
        )
    );
  }
First, there are the arrays for the RxJs subscriptions and the available and query items for Drag'n Drop. The *ItemParams contain the default parameters for the items. The yearOperators and the quarterQueryItems contain the drop-down values. The "symbols" array is updated with values when the user types characters into the symbol autocomplete. The FormFields are an enum with the key strings for the local form group. The @Output() EventEmitters provide the search results and activate or deactivate the spinner. The constructor gets the needed services and the FormBuilder injected and then creates the FormGroup with the FormControls and the FormFields. The QueryItems FormArray supports the nested forms in the components of the queryItems array. The validateItemTypes() validator for the term structure validation is added, and the initial parameters are set. Finally, the form status changes are subscribed to with delay(0) to update the formStatus property. The ngOnInit() method initializes the available items for Drag'n Drop. The value changes of the symbol autocomplete are subscribed to request the matching symbols from the backend and update the "symbols" property. The numberOperators and the "quarters" are requested from the backend to update the arrays with the selectable values. They are requested from the backend because that enables the backend to add new operators or new periods without changing the frontend.
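The validateItemTypes() cross-field validator is referenced above but not shown in this excerpt. Purely as an illustration of what such a term-structure check could look like, here is a hedged sketch written as a standalone function (in the project it is a component method). It assumes the project's FormFields, QueryFormFields, and ItemType enums are in scope and that each query item form group stores its ItemType; the actual implementation may differ.
TypeScript
import {
  AbstractControl,
  FormArray,
  ValidationErrors,
  ValidatorFn,
} from "@angular/forms";

// Hypothetical sketch (not the project's actual code): checks that every
// "Term Start" has a matching "Term End" and that no bracket is closed
// before it has been opened.
export function validateItemTypes(): ValidatorFn {
  return (control: AbstractControl): ValidationErrors | null => {
    const items =
      (control.get(FormFields.QueryItems) as FormArray)?.controls ?? [];
    let openTerms = 0;
    for (const item of items) {
      const itemType = item.get(QueryFormFields.ItemType)?.value;
      if (itemType === ItemType.TermStart) {
        openTerms += 1; // "Term Start" opens a bracket
      } else if (itemType === ItemType.TermEnd) {
        openTerms -= 1; // "Term End" closes the most recent bracket
        if (openTerms < 0) {
          return { unbalancedTerms: true }; // closed before it was opened
        }
      }
    }
    return openTerms === 0 ? null : { unbalancedTerms: true };
  };
}
Because the validator is attached to the whole form group, it re-runs whenever query items are added, removed, or reordered, which is what keeps the search button disabled while the bracket structure is invalid.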
The template looks like this:
HTML
<div class="container">
  <form [formGroup]="queryForm" novalidate>
    <div>
      <div class="search-header">
        <h2 i18n="@@createQueryCompanyQuery">Company Query</h2>
        <button mat-raised-button color="primary"
          [disabled]="!formStatus || formStatus.toLowerCase() != 'valid'"
          (click)="search()" i18n="@@search">
          Search
        </button>
      </div>
      <div class="symbol-financials-container">
        <mat-form-field>
          <mat-label i18n="@@operator">Operator</mat-label>
          <mat-select [formControlName]="FormFields.YearOperator" name="YearOperator">
            <mat-option *ngFor="let item of yearOperators" [value]="item">{{ item }}</mat-option>
          </mat-select>
        </mat-form-field>
        <mat-form-field class="form-field">
          <mat-label i18n="@@year">Year</mat-label>
          <input matInput type="text" formControlName="{{ FormFields.Year }}" />
        </mat-form-field>
      </div>
      <div class="symbol-financials-container">
        <mat-form-field class="form-field">
          <mat-label i18n="@@createQuerySymbol">Symbol</mat-label>
          <input matInput type="text" [matAutocomplete]="autoSymbol"
            formControlName="{{ FormFields.Symbol }}"
            i18n-placeholder="@@phSymbol" placeholder="symbol" />
          <mat-autocomplete #autoSymbol="matAutocomplete" autoActiveFirstOption>
            <mat-option *ngFor="let symbol of symbols" [value]="symbol.symbol">
              {{ symbol.symbol }}
            </mat-option>
          </mat-autocomplete>
        </mat-form-field>
        <mat-form-field class="form-field">
          <mat-label i18n="@@quarter">Quarter</mat-label>
          <mat-select [formControlName]="FormFields.Quarter" name="Quarter" multiple>
            <mat-option *ngFor="let item of quarterQueryItems" [value]="item">{{ item }}</mat-option>
          </mat-select>
        </mat-form-field>
      </div>
    </div>
    ...
</div>
First, the form gets connected to the form group queryForm of the component. Then the search button gets created and is disabled if the component property formStatus, which is updated by the form group, is not "valid." Next, the two <mat-form-field> elements are created for the selection of the year operator and the year. The options for the operator are provided by the yearOperators property. The input for the year is of type "text," but the reactive form has a regex validator that accepts only digits. Then, the symbol autocomplete is created, where the "symbols" property provides the returned options. The #autoSymbol template variable connects the input's matAutocomplete property with the options. The quarter select component gets its values from the quarterQueryItems property and supports multiple selection via checkboxes.
Drag'n Drop Structure
The template of the cdkDropListGroup looks like this:
HTML
<div cdkDropListGroup>
  <div class="query-container">
    <h2 i18n="@@createQueryAvailableSecQueryItems">
      Available Sec Query Items
    </h2>
    <h3 i18n="@@createQueryAddQueryItems">
      To add a Query Item. Drag it down.
    </h3>
    <div cdkDropList [cdkDropListData]="availableItems" class="query-list" (cdkDropListDropped)="drop($event)">
      <app-query *ngFor="let item of availableItems" cdkDrag
        [queryItemType]="item.queryItemType"
        [baseFormArray]="availableItemParams.formArray"
        [formArrayIndex]="availableItemParams.formArrayIndex"
        [showType]="availableItemParams.showType"></app-query>
    </div>
  </div>
  <div class="query-container">
    <h2 i18n="@@createQuerySecQueryItems">Sec Query Items</h2>
    <h3 i18n="@@createQueryRemoveQueryItems">
      To remove a Query Item. Drag it up.
</h3> <div cdkDropList [cdkDropListData]="queryItems" class="query-list" (cdkDropListDropped)="drop($event)"> <app-query class="query-item" *ngFor="let item of queryItems; let i = index" cdkDrag [queryItemType]="item.queryItemType" [baseFormArray]="queryItemParams.formArray" [formArrayIndex]="i" (removeItem)="removeItem($event)" [showType]="queryItemParams.showType" ></app-query> </div> </div> </div> The cdkDropListGroup div contains the two cdkDropList divs. The items can be dragged and dropped between the droplists availableItems and queryItems and, on dropping, the method drop($event) is called. The droplist divs contain <app-query> components. The search functions of “term start,” “term end,” and “query item type” are provided by angular components. The baseFormarray is a reference to the parent formgroup array, and formArrayIndex is the index where you insert the new subformgroup. The removeItem event emitter provides the query component index that needs to be removed to the removeItem($event) method. If the component is in the queryItems array, the showType attribute turns on the search elements of the components (querItemdParams default configuration). The drop(...) method manages the item transfer between the cdkDropList divs: TypeScript drop(event: CdkDragDrop<MyItem[]>) { if (event.previousContainer === event.container) { moveItemInArray( event.container.data, event.previousIndex, event.currentIndex ); const myFormArrayItem = this.queryForm[ FormFields.QueryItems ].value.splice(event.previousIndex, 1)[0]; this.queryForm[FormFields.QueryItems].value.splice( event.currentIndex, 0, myFormArrayItem ); } else { transferArrayItem( event.previousContainer.data, event.container.data, event.previousIndex, event.currentIndex ); //console.log(event.container.data === this.todo); while (this.availableItems.length > 0) { this.availableItems.pop(); } this.availableInit.forEach((myItem) => this.availableItems.push(myItem)); } } First, the method checks if the event.container has been moved inside the container. That is handled by the Angular Components function moveItemInArray(...) and the fromgrouparray entries are updated. A transfer between cdkDropList divs is managed by the Angular Components function transferArrayItem(...). The availableItems are always reset to their initial content and show one item of each queryItemType. The adding and removing of subformgroups from the formgroup array is managed in the query component. Query Component The template of the query component contains the <mat-form-fields> for the queryItemType. They are implemented in the same manner as the create-query template. 
The component looks like this:
TypeScript
@Component({
  selector: "app-query",
  templateUrl: "./query.component.html",
  styleUrls: ["./query.component.scss"],
})
export class QueryComponent implements OnInit, OnDestroy {
  protected readonly containsOperator = "*=*";
  @Input() public baseFormArray: FormArray;
  @Input() public formArrayIndex: number;
  @Input() public queryItemType: ItemType;
  @Output() public removeItem = new EventEmitter<number>();
  private _showType: boolean;
  protected termQueryItems: string[] = [];
  protected stringQueryItems: string[] = [];
  protected numberQueryItems: string[] = [];
  protected concepts: FeConcept[] = [];
  protected QueryFormFields = QueryFormFields;
  protected itemFormGroup: FormGroup;
  protected ItemType = ItemType;
  private subscriptions: Subscription[] = [];

  constructor(
    private fb: FormBuilder,
    private configService: ConfigService,
    private financialDataService: FinancialDataService
  ) {
    this.itemFormGroup = fb.group({
      [QueryFormFields.QueryOperator]: "",
      [QueryFormFields.ConceptOperator]: "",
      [QueryFormFields.Concept]: ["", [Validators.required]],
      [QueryFormFields.NumberOperator]: "",
      [QueryFormFields.NumberValue]: [
        0,
        [
          Validators.required,
          Validators.pattern("^[+-]?(\\d+[\\,\\.])*\\d+$"),
        ],
      ],
      [QueryFormFields.ItemType]: ItemType.Query,
    });
  }
This is the QueryComponent with the baseFormArray of the parent, into which the itemFormGroup is added at the formArrayIndex. The queryItemType switches the query elements on or off. The removeItem event emitter provides the parent component with the index of the component to remove. The termQueryItems, stringQueryItems, and numberQueryItems are the select options of their components. The concepts array contains the FeConcept autocomplete options for the concept input. The constructor gets the FormBuilder and the needed services injected. The itemFormGroup of the component is created with the FormBuilder. The QueryFormFields.Concept and the QueryFormFields.NumberValue controls get their validators.
Query Component Init
The component initialization looks like this:
TypeScript
ngOnInit(): void {
  this.subscriptions.push(
    this.itemFormGroup.controls[QueryFormFields.Concept].valueChanges
      .pipe(debounceTime(200))
      .subscribe((myValue) =>
        this.financialDataService
          .getConcepts()
          .subscribe(
            (myConceptList) =>
              (this.concepts = myConceptList.filter((myConcept) =>
                FinancialsDataUtils.compareStrings(
                  myConcept.concept,
                  myValue,
                  this.itemFormGroup.controls[QueryFormFields.ConceptOperator]
                    .value
                )
              ))
          )
      )
  );
  this.itemFormGroup.controls[QueryFormFields.ItemType].patchValue(
    this.queryItemType
  );
  if (
    this.queryItemType === ItemType.TermStart ||
    this.queryItemType === ItemType.TermEnd
  ) {
    this.itemFormGroup.controls[QueryFormFields.ConceptOperator].patchValue(
      this.containsOperator
    );
    ...
  }
  //make service caching work
  if (this.formArrayIndex === 0) {
    this.getOperators(0);
  } else {
    this.getOperators(400);
  }
}

private getOperators(delayMillis: number): void {
  setTimeout(() => {
    ...
    this.subscriptions.push(
      this.configService.getStringOperators().subscribe((values) => {
        this.stringQueryItems = values;
        this.itemFormGroup.controls[
          QueryFormFields.ConceptOperator
        ].patchValue(
          values.filter((myValue) => this.containsOperator === myValue)[0]
        );
      })
    );
    ...
  }, delayMillis);
}
First, the QueryFormFields.Concept form control value changes are subscribed to request (with a debounce) the matching concepts from the backend service. The results are filtered with compareStrings(...) and QueryFormFields.ConceptOperator (default is "contains").
Then, it is checked if the queryItemType is TermStart or TermEnd to set default values in their form controls. Then, the getOperators(...) method is called to get the operator values from the backend service. The backend services cache the values of the operators to load them only once and use the cache after that. The first array entry requests the values from the backend, and the other entries wait 400 ms for the responses to arrive and then use the cache. The getOperators(...) method uses setTimeout(...) for the requested delay. Then, the configService method getStringOperators() is called, and the subscription is pushed onto the "subscriptions" array. The results are put in the stringQueryItems property for the select options. The result value that matches the containsOperator constant is patched into the operator value of the form control as the default value. All operator values are requested concurrently.
Query Component Type Switch
If the component is dropped in a new drop list, the form array entry needs an update. That is done in the showType(...) setter:
TypeScript
@Input()
set showType(showType: boolean) {
  this._showType = showType;
  const formIndex =
    this?.baseFormArray?.controls?.findIndex(
      (myControl) => myControl === this.itemFormGroup
    ) ?? -1;
  if (!this._showType) {
    // add the item's form group if it is not yet in the form array
    if (formIndex < 0) {
      this.baseFormArray.insert(this.formArrayIndex, this.itemFormGroup);
    }
  } else {
    // remove the item's form group if it is currently in the form array
    if (formIndex >= 0) {
      this.baseFormArray.removeAt(formIndex);
    }
  }
}
If the item has been added to the queryItems, the showType(...) setter sets the property and adds the itemFormGroup to the baseFormArray. The setter removes the itemFormGroup from the baseFormArray if the item has been removed from the queryItems.
Creating Search Request
To create a search request, the search() method is used:
TypeScript
public search(): void {
  const symbolFinancialsParams = {
    yearFilter: {
      operation: this.queryForm.controls[FormFields.YearOperator].value,
      value: !this.queryForm.controls[FormFields.Year].value
        ? 0
        : parseInt(this.queryForm.controls[FormFields.Year].value),
    } as FilterNumber,
    quarters: !this.queryForm.controls[FormFields.Quarter].value
      ? []
      : this.queryForm.controls[FormFields.Quarter].value,
    symbol: this.queryForm.controls[FormFields.Symbol].value,
    financialElementParams: !!this.queryForm.controls[FormFields.QueryItems]
      ?.value?.length
      ? this.queryForm.controls[FormFields.QueryItems].value.map(
          (myFormGroup) => this.createFinancialElementParam(myFormGroup)
        )
      : [],
  } as SymbolFinancialsQueryParams;
  this.showSpinner.emit(true);
  this.financialDataService
    .postSymbolFinancialsParam(symbolFinancialsParams)
    .subscribe((result) => {
      this.processQueryResult(result, symbolFinancialsParams);
      this.showSpinner.emit(false);
    });
}

private createFinancialElementParam(
  formGroup: FormGroup
): FinancialElementParams {
  return {
    conceptFilter: {
      operation: formGroup[QueryFormFields.ConceptOperator],
      value: formGroup[QueryFormFields.Concept],
    },
    valueFilter: {
      operation: formGroup[QueryFormFields.NumberOperator],
      value: formGroup[QueryFormFields.NumberValue],
    },
    operation: formGroup[QueryFormFields.QueryOperator],
    termType: formGroup[QueryFormFields.ItemType],
  } as FinancialElementParams;
}
The symbolFinancialsParams object is created from the values of the queryForm form group, or the default values are set.
The FormFields.QueryItems FormArray is mapped with the createFinancialElementParam(...) method, which creates the conceptFilter and valueFilter objects with their operations and values for filtering. The query operation and the termType are set as well. Then, the financialDataService.postSymbolFinancialsParam(...) method posts the object to the server and subscribes to the result. During the latency of the request, the spinner of the parent component is shown.
Conclusion
The Angular Components library's support for Drag'n Drop is very good. That makes the implementation much easier. The reactive forms of Angular enable flexible form checking that includes subcomponents with their own FormGroups. The custom validation functions allow the logical structure of the terms to be checked. Due to the features of the Angular framework and the Angular Components library, the implementation needed surprisingly little code.
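As a closing illustration of the term structure described above, the request body produced by search() might look roughly like the following. This is a hypothetical example derived only from the mapping code shown earlier; the concept name, operator strings, quarter label, and enum serializations are invented for illustration and may not match the project's actual wire format.
TypeScript
// Hypothetical SymbolFinancialsQueryParams payload: one query clause on the
// "Revenues" concept, wrapped in a Term Start / Term End bracket pair.
const exampleRequest = {
  yearFilter: { operation: "=", value: 2020 },
  quarters: ["FY"],
  symbol: "IBM",
  financialElementParams: [
    { termType: "TermStart", operation: "And",
      conceptFilter: { operation: "*=*", value: "" },
      valueFilter: { operation: "=", value: 0 } },
    { termType: "Query", operation: "And",
      conceptFilter: { operation: "*=*", value: "Revenues" },
      valueFilter: { operation: ">", value: 1000000 } },
    { termType: "TermEnd", operation: "And",
      conceptFilter: { operation: "*=*", value: "" },
      valueFilter: { operation: "=", value: 0 } },
  ],
};
console.log(JSON.stringify(exampleRequest, null, 2));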