New throttling policy is in the planning stage: "Concurrent Requests Limit Policy"

by Yıldırım (Updated 28-09-2021 17:23 by Yıldırım VISMA)

We're getting ready to introduce a new throttling policy for the Visma.net Financials API, called the "Concurrent Requests Limit Policy". The goal behind it is to maintain more stable performance while the API continues to grow, alongside ongoing optimization work.

 

Currently, we plan to implement this policy with "LOG ONLY" mode enabled.
This means that requests from clients sending parallel/concurrent calls will not be throttled during this period.

 

In this way, we'll be able to gather more information about clients and API transactions so that we can analyse request behaviour. This will let us determine the optimal limits for the best, most stable performance our infrastructure can sustain.

Information about "Concurrent Requests Limit Policy"

This policy limits the number of concurrent requests a client can have per company/tenant. It does not use a time interval for the check: it counts the "running" (not yet completed) requests of the calling client on the requested company/tenant at the time the check is performed.

For logging purposes there is an "only_log" flag: if set to true, the policy will not throttle the request, but will only log the number of running requests the client has on the invoked company.
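
To make the mechanics concrete, here is a minimal sketch of how such a check could behave. This is not Visma's implementation; the names, the limit value, and the in-memory counter are all illustrative assumptions.

```python
# Hypothetical sketch of the concurrency check (not Visma's actual code).
# Incrementing/decrementing the counters as requests start and finish is
# omitted; only the check itself is shown.

running_requests: dict[tuple[str, str], int] = {}  # (client_id, company_id) -> in-flight count
CONCURRENT_LIMIT = 5   # illustrative value; the real limit is not yet decided
ONLY_LOG = True        # "LOG ONLY" mode: observe and log, never throttle

def may_proceed(client_id: str, company_id: str) -> bool:
    """Return True if the request may run, False if it should be answered with 429."""
    running = running_requests.get((client_id, company_id), 0)
    if running >= CONCURRENT_LIMIT:
        if ONLY_LOG:
            print(f"would throttle {client_id}/{company_id}: {running} running requests")
            return True  # log-only mode lets the request through
        return False     # enforced mode: reject with 429
    return True
```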

 

Please feel free to share your thoughts with us here or via developersupport@visma.com.

Thanks.

12 Comments
adrianm
PARTNER
by adrianm

I have a question regarding the reasoning behind the throttling policy.

 

If I need to make 10 requests to get the information I need, why is it better to implement throttling on the client instead of sending everything and letting the server sort it out?

 

On the server side you have all kinds of information on current load etc. and can prioritize the requests. On the client side we must fall back to the slowest solution (one request at a time, or whatever limit you impose) regardless of whether your servers are busy or idle.

 

(I fully understand if you need to set a limit to avoid DoS attacks etc., but it is the performance part I don't understand.)

Yıldırım
VISMA
by Yıldırım (Updated 04-10-2021 11:21 by Yıldırım VISMA)

Hello,

The Concurrent Requests Limit Policy will be implemented on the server side, based on load balancing. (How many parallel requests will be allowed will be determined once the behaviour analysis is complete.)

The client can still issue parallel requests, but the client should pay attention to the response status code. If the request is throttled by the server, a 429 status code will be in the response, and the client should detect this and know that the request must be retried later.
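
For illustration, the retry handling could look like the sketch below. This is only a sketch under assumptions: the requests library, the backoff values, and the attempt limit are illustrative, not a prescribed client design.

```python
import time
import requests

def get_with_retry(url: str, headers: dict, max_attempts: int = 5) -> requests.Response:
    """GET a resource, retrying with exponential backoff whenever the server answers 429."""
    delay = 1.0
    for _ in range(max_attempts):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        time.sleep(delay)  # wait before retrying the throttled request
        delay *= 2         # back off exponentially between attempts
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```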

adrianm
PARTNER
by adrianm

> If the request is throttled by the server, a 429 status code will be in the response

 

Sorry, but that is even worse (and that is not what throttling means).

You are forcing each client to implement a retry policy everywhere, because any request can then randomly return 429. And what does "later" mean? The client still needs the data, so why not just delay the response if the server is too busy?

mrtnsn
CONTRIBUTOR ***
by mrtnsn (Updated 05-10-2021 08:33 by mrtnsn)

> a 429 status code will be in the response, and the client should detect this and know that the request must be retried later.

 

If this is the case, then please also provide a "Retry-After" header with the number of seconds until the request can be run again. That way we can at least easily queue the request.
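
If such a header were provided, honoring it would be straightforward. A sketch, assuming the requests library and a numeric Retry-After value (note that a later reply in this thread states the server will not send this header):

```python
import time
import requests

def get_honoring_retry_after(url: str, headers: dict) -> requests.Response:
    """Retry a throttled GET after the server-advised number of seconds."""
    while True:
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        try:
            wait = float(response.headers.get("Retry-After", "1"))
        except ValueError:
            wait = 1.0  # fall back if the header is absent or not numeric
        time.sleep(wait)
```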

 

Also, you should probably clarify whether you're talking about "rate limiting" or "throttling", because this sounds a lot like rate limiting.

Yıldırım
VISMA
by Yıldırım (Updated 14-10-2021 14:21 by Yıldırım VISMA)

Hello, 

We're currently putting together more detailed background information, which we hope will help clarify your concerns and questions. Thank you for your understanding.

 

AliMKhan
VISMA
by AliMKhan (Updated 18-10-2021 11:01 by AliMKhan VISMA)

This throttling or rate-limiting policy will reject the next request and give it a 429 response if there are already X_number_of_requests in progress from the client on a company.

 

The response will contain the header "x-rate-limit-policy: ConcurrentRequestsLimitPolicy" to tell the client which exact policy was violated. The server will not send a Retry-After or similar header, since the server does not know how long it will take before the current requests finish executing.

 

This is why the client itself needs to keep track of its requests. The client should have a counter for concurrent requests. It needs to be increased on every new request sent and decreased on every response received. The counter should be in "shared" storage if the client exists as several instances.

This way, the client will know if it is allowed to issue the next request: when counter < X_number_of_requests.
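
In a single process, a semaphore can play the role of that counter: it is acquired when a request is sent and released when the response arrives. A minimal sketch, assuming aiohttp and a placeholder limit; a client running as several instances would need the counter in shared storage such as Redis, as noted above.

```python
import asyncio
import aiohttp

X_NUMBER_OF_REQUESTS = 5  # placeholder; the real limit has not been announced
limiter = asyncio.Semaphore(X_NUMBER_OF_REQUESTS)

async def limited_get(session: aiohttp.ClientSession, url: str) -> str:
    """Hold a concurrency slot for the full lifetime of the request."""
    async with limiter:  # counter += 1 on entry, counter -= 1 on exit
        async with session.get(url) as response:
            return await response.text()

async def fetch_all(urls: list[str]) -> list[str]:
    # Issue all requests "in parallel"; the semaphore caps how many run at once.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(limited_get(session, u) for u in urls))
```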

 

We are still gathering data to determine how many concurrent calls should be allowed.
The X_number_of_requests value will be clearly communicated beforehand.

 

We do not support queueing requests, other than what is standard for web servers. Neither will we implement it for the normal flow, where the purpose is a request/response pattern. Such queueing of requests would soon result in timeout errors on the clients, since they would have to wait for the responses. 

 

However, we are working on a way to call the APIs with a special header indicating that the entire request should be persisted in storage or a queue; the API will respond immediately with 202 Accepted while the actual invocation runs as a background task. A webhook notification will be sent when a response is available.

This approach suits clients that do not need to wait for a response in order to continue their flow, or that need to carry out an operation via the API that will take a while. This work is in progress and is expected to be delivered in Q1/22.
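
As a rough illustration of that flow: the header name and the call shape below are placeholders, since none of the details are published yet; treat this as a hypothetical sketch only.

```python
import requests

def submit_background_request(url: str, token: str, payload: dict) -> None:
    """Ask the API to run the call as a background task (hypothetical header)."""
    response = requests.post(
        url,
        json=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "X-Async-Invocation": "true",  # placeholder name for the announced special header
        },
    )
    # The API is expected to answer immediately with 202 Accepted; the real
    # result arrives later via a webhook notification.
    assert response.status_code == 202
```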

adrianm
PARTNER
by adrianm

>We do not support queueing requests, other than what is standard for web servers.

>Neither will we implement it for the normal flow

..
>This is why the client itself needs to keep track of its requests.

>The client should have a counter for concurrent requests.

>It needs to be increased on every new request sent and decreased on every response received.


Interesting way to treat your customers. Let them solve your performance problems by breaking existing clients.


>The counter should be in "shared" storage if the client exists as several instances.


My customers are definitely not interested in paying me to develop and test such a complicated solution (where the only difference for them is a slower solution).

Since you have already implemented the request counting on the server, to return 429, I will in most cases use that functionality and just do a retry on 429+ConcurrentRequestsLimitPolicy until it succeeds.
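
That strategy is simple enough to sketch, assuming the requests library and the x-rate-limit-policy header described above; the fixed one-second delay is arbitrary:

```python
import time
import requests

def get_retrying_on_concurrency_limit(url: str, headers: dict) -> requests.Response:
    """Retry only when the 429 was caused by the concurrent-requests policy."""
    while True:
        response = requests.get(url, headers=headers)
        throttled = (
            response.status_code == 429
            and response.headers.get("x-rate-limit-policy") == "ConcurrentRequestsLimitPolicy"
        )
        if not throttled:
            return response
        time.sleep(1)  # arbitrary delay; the server sends no Retry-After hint
```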

 

Still don't understand what real world problem you are trying to solve here. Do you get a lot of "unnecessary" requests you hope will disappear with this solution? Have you asked the API-users why they send so many requests?

roy_muller
VISMA
by roy_muller (Updated 22-10-2021 08:56 by roy_muller VISMA)

>Interesting way to treat your customers. Let them solve your performance problems by breaking existing clients.


The reason behind this is that some customers are in fact issuing a lot of concurrent requests, and some of these require rather heavy database queries. Such API requests have a long lifetime and, if not limited, will drain server resources; other API requests (including those of other clients) will slow down and might also time out. So, if not handled, you would see clients breaking, and that would not be a nice way to treat customers.

 

>Since you have already implemented the request counting on the server, to return 429, I will in most cases use that functionality and just do a retry on 429+ConcurrentRequestsLimitPolicy until it succeeds.

 

Yes, you can do that, and if you have 4xx response status handling in place (as you should) along with a retry policy, then your client will not break because of this. But instead of hammering the server with numerous retries in all situations, it is a much better solution to communicate the exact problem so that a client can handle it more elegantly.

 

>Still don't understand what real world problem you are trying to solve here. Do you get a lot of "unnecessary" requests you hope will disappear with this solution? Have you asked the API-users why they send so many requests?

 
The real-world problem is that resources are not unlimited. So instead of letting clients fail miserably, it is better to share the available capacity in a controlled way. We will of course also discuss alternative solutions with those integrators that send thousands of simultaneous API requests, and we will continuously work to improve endpoints that have performance issues and to better scale our server resources. But as long as there is no overnight solution to all problems, rate limiting seems in order. The 429 status code was "invented" for a reason. In the real world, most services do not allow unlimited concurrent requests; such traffic could even be interpreted as a DoS or DDoS attack by security measures.
 
plh
CONTRIBUTOR ***
by plh

Hello

 

I do agree with adrianm on this matter. It seems that you are moving your problems onto your partners in this case.

We already have a hard time integrating with Visma.net. Sorry, I do understand you might have performance issues; we have reported feature requests to make this better, but sadly those requests have been stored away in the backlog for years and years.

 

We do not have any way to make a "multi request" to fetch, say, 100 items; we're forced to call the Visma inventory API 100 times for 100 specific items instead of fetching all 100 in a single API request. No wonder you get a lot of requests.

Help us improve API utilization instead of setting up even more limits 😞

 

As reported a long time ago, many, many requests to the Visma API take several seconds. I imagine this is putting pressure on your servers/databases, so perhaps (I am guessing) there is room for improvement here: if simple GET requests for an inventory item could be executed faster, that would free resources. We also can't filter the returned data: if I need a single field, say "stock", I have to call an endpoint that returns hundreds of data fields.

When comparing API request timings with other ERP systems that we integrate with, we see much faster responses from the other platforms.

 

My take is that you should improve your API offering to your partners before setting more restrictive limits.

 

From a partner perspective it feels wrong that our feature requests for an improved API and better performance are not prioritized, and yet you do have development resources for building these new restrictions.

roy_muller
VISMA
by roy_muller

The "restrictive limit" we are talking about here is to limit the number of concurrent/simultanious/parallel requests for the same client/tenant combo. This limitation won't affect anyone that is not issuing a lot of concurrent requests for the same company.

 

Did you try to issue 1000 requests in parallel towards the same company in one of the other systems you integrate with?

plh
CONTRIBUTOR ***
by plh

Hi Roy.

 

I am aware of how the limitation is supposed to work. 

 

No, I am not issuing 1000 concurrent requests to other systems, nor do I towards Visma.net.

 

Microsoft Business Central sets its API concurrency limit to 5, which makes good sense: it keeps API consumers from misusing system resources while still allowing some concurrency.

https://docs.microsoft.com/en-us/dynamics365/business-central/dev-itpro/api-reference/v1.0/dynamics-...

 

However, when calls towards your system can take up to 30 seconds to respond, and 10 people are using our system integrated against yours, they will have to wait for other people's requests to finish before theirs are even queued up. This would make our application unresponsive for several other customer employees.

 

The slower the API is, the more requests naturally pile up as "concurrent"; this needs to be taken into account.

 

What are your plans for the limit: will 1, 5, or 25 concurrent requests be allowed?

 

AliMKhan
VISMA
by AliMKhan

Hello.

Sorry for the late reply.

1) As mentioned earlier by Roy, we are working on improving our endpoints so they can respond faster and deliver more specific information instead of everything, as they do now.
One example of this work is the neXtGen SalesOrderService, which includes v3/salesorder, v3/inventory and v3/customer.

More info about it can be found here:
Getting started with the first neXtGen service: Visma.net ERP Sales Order API 

 

2) As you mention, we are running in a multi-tenant environment with shared resources, and we have to have some measures in place to ensure the quality of the service.
That is the reasoning behind having a concurrency policy.
We have not landed on a number yet, but it will most likely be a percentage of the resources available in the environment.