===== Where does the latency come from in the web API? =====

By Dmitry Degtyarev (dvdegtyarev@edu.hse.ru)

==== INTRODUCTION ====

The modern world is hard to imagine without web APIs. With the growing demand for cloud computing, organizations increasingly build and expose web services. Local and web applications alike are driven by data and functionality supplied by third-party developers, and it is through web APIs that they access them [1]. The interface itself acts as a kind of contract: a ready-made pattern that governs the interaction between an application or web browser and a web server.

Web APIs evolve quickly and, one might say, almost imperceptibly. Their evolution is rapid because functionality keeps changing and developers keep introducing revisions [2]. As a result, developers no longer have to worry about every piece of their product's functionality, since some of the core capabilities are taken over by other organizations outside their control. But if development is this fast and not always under control, can a process of "degradation" set in that makes the interaction between the service and the end user more difficult? This is where the notion of web API latency becomes relevant: despite its seemingly negligible values, it can cause serious damage to a company. When information is not delivered almost instantly, the user's attention is lost and becomes harder and harder to regain. A study conducted by Google showed that even a seemingly imperceptible increase in search response time, from 100 ms to 400 ms, noticeably reduces the number of searches users perform [3]. This essay examines and identifies the causes of web API latency; knowing them helps preserve the reputation of an application that runs into these difficulties.

==== WHAT IS RESPONSE TIME AND LATENCY ====

The quality of interaction with an end user or client depends heavily on API availability and performance, since these factors shape what happens at the interaction endpoints [4]. Monitoring deviations in response time helps preserve the performance of cloud platforms, whose rapid changes, complexity, and scale make them otherwise difficult to observe.

The terms latency and response time are often used interchangeably, but they do not mean the same thing. Response time measures the total time it takes a web API to receive, process, and respond to a client request. Latency, by contrast, measures only the time the request and the response spend travelling between the client and the server. These are the key parameters characterizing web API performance, and it is by them that the efficiency of interaction between software components through an API can be judged. Both latency and response time are measured in milliseconds (ms), and in some cases in microseconds (µs). No human can consciously perceive such a small interval or even suspect that a delay has occurred. Nevertheless, studies show that in the presence of latency user satisfaction drops and the number of online requests decreases, which in turn hurts the revenue and profitability of the application [5]. A simple way to observe response time from the client side is sketched below.
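
As a rough illustration (not taken from the cited sources), the following Python sketch times a single HTTP request from the client side. The endpoint URL is a hypothetical placeholder, and the measured value includes network travel in both directions plus server processing, so it corresponds to the response time defined above rather than to latency alone.

<code python>
import time
import urllib.request

# Hypothetical endpoint used only for illustration.
URL = "https://api.example.com/health"

start = time.perf_counter()
with urllib.request.urlopen(URL, timeout=5) as resp:
    status = resp.status
    body = resp.read()  # wait until the full response body has arrived
elapsed_ms = (time.perf_counter() - start) * 1000

# The figure printed here is the client-observed response time: network
# latency in both directions plus the server's processing time.
print(f"status={status}, {len(body)} bytes, response time {elapsed_ms:.1f} ms")
</code>

Repeating such a measurement periodically and recording the results is the simplest form of the response-time monitoring discussed above.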

It follows that eliminating API latency is critical to a smooth and healthy interaction between the service and the user. It is therefore necessary to detect latency in good time, to understand where it comes from, and to study the reasons for its occurrence.

==== CAUSES OF LATENCY ====

API response time depends on network latency, which is determined by the geographical distance between the user and the server, network congestion, and the quality of the Internet connection. The further the user is from the server hosting the website, the more significant and noticeable the latency will be. There are, however, techniques for minimizing network latency, such as using content delivery networks (CDNs) or hosting the API on multiple servers in different geographical locations [6].

Round-trip time (RTT) is a metric that characterizes how long it takes a client to receive a response after it has sent a request. RTT is roughly twice the one-way latency, since the data travels the same path in both directions [7]. The requested information must pass through multiple networks rather than just one, and it is precisely the number of networks an HTTP response has to traverse that drives latency up. As data packets move between networks they pass through Internet Exchange Points (IXPs), where routers process them and decide where to forward them. RTT can also grow when the number of packets increases, which happens when routers fragment the original packets into smaller ones.

An equally important factor affecting latency is the size of the data sent between the client and the server. A large payload noticeably increases web API response time because of the cost of transmitting and processing it. The payload can be optimized by applying data compression, removing unnecessary fields from responses, or using more efficient data formats; a small illustration of compression is given at the end of this section.

Dependency on third-party vendors also plays an important role in the origin of web API latency. As already mentioned, web APIs are ubiquitous and publicly available, providing functions that range from access to social networks to sophisticated transaction processing [8]. Integrating a web API into an application has therefore become a routine process that rarely poses a problem in itself. As a result, application creators often do not delve into the technical details and rely on third-party organizations that do not report to them and are beyond their control. This can disrupt the interaction between the user and the application. The integration faults described in [9] arise from invalid user input, missing user input, expired data requests, invalid request data, and similar causes; a simple client-side validation sketch is shown after the compression example below.

These are the most important causes of latency. There are others, such as low server performance, the transmission medium, and large volumes of parallel data. Knowing where latency in a web API comes from makes it possible to act against it in time; this knowledge is, in my opinion, very valuable when working with web APIs and integrating them into applications. It has also been shown that latency is a legitimate cause for concern, and that timely effort to keep it from growing protects the reputation of the application.
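
To make the payload-size point concrete, here is a minimal sketch with invented example data (my own illustration, not drawn from the cited sources) showing how much a repetitive JSON response shrinks under gzip compression; a smaller payload means less time spent on transmission.

<code python>
import gzip
import json

# Hypothetical API response: a list of repetitive records, as commonly
# returned by list endpoints. The data is invented for illustration only.
payload = [
    {"id": i, "name": f"item-{i}", "description": "example " * 20}
    for i in range(1000)
]

raw = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(raw)

# Repetitive, text-based payloads usually compress very well, which cuts
# transfer time at the cost of a small amount of CPU on both ends.
ratio = 100 * len(compressed) / len(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes ({ratio:.1f}% of original)")
</code>

In practice the same effect is usually achieved by enabling gzip or Brotli through the HTTP Content-Encoding mechanism and letting the client and server negotiate it.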
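
As for the integration faults mentioned above, many of them (invalid or missing user input, expired or malformed request data) can be caught before a request is ever sent. The sketch below is a hypothetical client-side check inspired by those fault categories, not a pattern taken from the cited study; the field names and rules are assumptions.

<code python>
from datetime import datetime

def validate_payment_request(req: dict) -> list[str]:
    """Return a list of problems found in a hypothetical payment request."""
    errors = []
    # Missing user input
    for field in ("amount", "currency", "card_expiry"):
        if req.get(field) in ("", None):
            errors.append(f"missing field: {field}")
    # Invalid request data
    if req.get("amount") not in ("", None):
        try:
            if float(req["amount"]) <= 0:
                errors.append("amount must be positive")
        except (TypeError, ValueError):
            errors.append("amount is not a number")
    # Expired data
    expiry = req.get("card_expiry")
    if expiry:
        try:
            if datetime.strptime(expiry, "%m/%y") < datetime.now():
                errors.append("card has expired")
        except ValueError:
            errors.append("card_expiry must look like MM/YY")
    return errors

print(validate_payment_request({"amount": "-5", "currency": "EUR", "card_expiry": "01/20"}))
# -> ['amount must be positive', 'card has expired']
</code>

Rejecting such requests locally avoids a pointless round trip to the server and the extra latency the user would otherwise experience.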

==== CONCLUSION ====

A few extra milliseconds may not seem like a noticeable change in performance to the average person; however, they compound with internal processes such as the overall payload size and the page load time incurred by the client and server when establishing a connection. These seemingly minor changes are often among the key reasons why companies lose hundreds of thousands of dollars. Leading experts publish studies and guides that give insight into web API design and the handling of various errors, with clear indications of the actions needed to monitor data transfer and maintain user confidence. In conclusion, paying attention to web API response times and latencies, digging into their root causes, and optimizing them in good time will help developers and make their work easier. All of this helps keep an application's rating high and avoid financial losses.

==== REFERENCES ====

- 1. Nilsson O., Yngwe N. API Latency and User Experience: What Aspects Impact Latency and What are the Implications for Company Performance? 2022.
- 2. Sohan S.M., Anslow C., Maurer F. A Case Study of Web API Evolution // 2015 IEEE World Congress on Services. 2015. P. 245–252.
- 3. Brutlag J. Speed Matters for Google Web Search. 2009.
- 4. Xu J. et al. Lightweight and Adaptive Service API Performance Monitoring in Highly Dynamic Cloud Environment // 2017 IEEE International Conference on Services Computing (SCC). Honolulu, HI, USA: IEEE, 2017. P. 35–43.
- 5. Shankar V., Smith A.K., Rangaswamy A. Customer Satisfaction and Loyalty in Online and Offline Environments // International Journal of Research in Marketing. 2003. Vol. 20, No. 2. P. 153–175.
- 6. API Response Times: A Quick Guide to Improving Performance [Electronic resource]. URL: https://prismic.io/blog/api-response-times (accessed: 21.12.2023).
- 7. What is latency? | How to fix latency [Electronic resource] // Cloudflare. URL: https://www.cloudflare.com/learning/performance/glossary/what-is-latency/ (accessed: 21.12.2023).
- 8. Benchmarking Web API Quality - Revisited [Electronic resource] // IEEE Xplore. URL: https://ieeexplore.ieee.org/abstract/document/10247284 (accessed: 21.12.2023).
- 9. Aué J. et al. An Exploratory Study on Faults in Web API Integration in a Large-Scale Payment Company // Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice. New York, NY, USA: ACM, 2018. P. 13–22.