2017-12-13

Seconds Access Limiter for Web API with Python, PHP, Ruby, and Perl

1. Summary

In this article, I describe an access limitation solution that is often required for Web APIs. As one such solution, I present a “One-Second Access Limiter” with sample code in four interpreted languages: Python, PHP, Ruby, and Perl.

2. Introduction

In Web API development projects, we are often given requirements such as “access limitation within a certain period.” For example, the Web API must return the HTTP status code “429 Too Many Requests” when the number of accesses exceeds the limit. Designers and developers then have to make this check fast and keep its load low, because if the purpose of access limitation is to reduce resource load, logic that increases the load defeats that purpose. In addition, when the reference period is short and accurate results are required, the algorithm itself must be accurate. Engineers who have developed a Web Application Firewall (WAF) will already be familiar with these points. There are many access limitation solutions; in this post I provide a sample “One-Second Access Limiter” as one of them.
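
To make the requirement concrete, the following is a minimal in-memory sketch of the decision itself: count the accesses that fall into the same whole second and answer 429 once the count exceeds the limit. It is not the file-based implementation shown later; the class name `OneSecondCounter` and its method `hit` are hypothetical names chosen for illustration, and the sketch assumes a single process.

```python
import time


class OneSecondCounter:
    """In-memory illustration of "up to N accesses per second" (hypothetical helper)."""

    def __init__(self, limit):
        self.limit = limit    # N in the requirement
        self.second = None    # whole second currently being counted
        self.count = 0        # accesses seen in that second

    def hit(self, now=None):
        """Record one access and return the HTTP status code to respond with."""
        now = time.time() if now is None else now
        second = int(now)
        if second != self.second:
            # A new second has started: reset the counter.
            self.second = second
            self.count = 0
        self.count += 1
        return 429 if self.count > self.limit else 200
```

With a limit of 2, the third access inside the same second would receive 429, and the first access of the next second would be accepted again.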

3. Requirements

"Access limitation up to N times per second"

1. If accesses exceed N times per second, return the HTTP status code "429 Too Many Requests" and block the access.
2. The value assigned to “N” depends on the specification of the project.
3. Because the control window is only one second, this check must not become a bottleneck for the access processing capacity.

4. Key Points of Architectures

These requirements mean that the check must be as fast and lightweight as possible.

# Prohibition of Use of Web Application Framework

Even a lightweight framework incurs a considerable cost just to load. Therefore, this check should be implemented “before processing enters the framework.”
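
As a rough sketch of this ordering, the limiter decision can be taken first and the framework loaded only for requests that pass. The function name `respond_before_framework` and the `load_framework_app` callback are assumptions made for this illustration; they stand in for whatever entry point a real project uses.

```python
import json


def respond_before_framework(allowed, load_framework_app):
    """Return (status_code, body), loading the framework only for allowed requests."""
    if not allowed:
        # Rejected requests never pay the cost of loading the framework.
        return 429, json.dumps({"message": "Too Many Requests"})
    app = load_framework_app()  # deferred: the costly load happens only here
    return app()
```

The point of the design is that a rejected request touches only built-in code.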

# Libraries Loading

To minimize the cost of loading libraries, the implementation should rely on built-in features as far as possible.

# Exception/Error Handling

It makes no sense to increase the load by relying on the framework for exception and error handling. These should be implemented simply, in low-level code.
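
One simple low-level approach, sketched here under the assumption that the limiter raises exceptions of the form `Exception(message, code)`, is a small function that turns any exception into an HTTP status. The name `to_http_error` is hypothetical. Anything that does not carry a 4xx code is reported as a generic 500 so that internal messages are not exposed.

```python
def to_http_error(exc):
    """Map an exception to (http_code, http_status) without any framework support."""
    args = exc.args
    if len(args) >= 2 and isinstance(args[1], int) and 400 <= args[1] <= 499:
        # Client errors raised deliberately by the limiter, e.g. ('Too Many Requests', 429).
        return args[1], str(args[0])
    # Everything else (5xx codes, OSError, ValueError, ...) becomes a generic 500.
    return 500, "Internal Server Error"
```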

# Data Resource Selection

It is better to avoid heavyweight data resources such as an RDBMS, yet for this requirement "Eventual Consistency" is not acceptable either. Implementing the limit in a load balancer or reverse proxy is also one solution, but the more of the application layer they handle, the more processing cost the whole communication incurs. A semi-synchronous store such as a memory cache or lightweight NoSQL is one option, but in this article I use the file system as the data resource. To avoid waits such as file locking, the count is controlled by file names and the number of files. In a cluster environment, however, a data synchronization solution is necessary.
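
The file-name idea can be sketched in a few lines: each access creates an empty file named after its microsecond timestamp, and the count is the number of files whose name falls in the same second. This is a condensed illustration, not the full sample code below; the function name `count_same_second` and the injectable `now` parameter are assumptions added so the behaviour is easy to try out.

```python
import os
import tempfile


def count_same_second(directory, now):
    """Create a marker file named after `now`; return the markers seen so far in that second."""
    name = "%.6f" % now
    open(os.path.join(directory, name), "w").close()  # no lock is taken
    own = float(name)
    second = int(own)
    count = 0
    for base_name in os.listdir(directory):
        try:
            stamp = float(base_name)
        except ValueError:
            continue  # not a marker file
        if int(stamp) == second:
            if stamp <= own:
                count += 1  # own access or an earlier one in this second
        elif int(stamp) < second:
            try:
                os.remove(os.path.join(directory, base_name))  # garbage-collect past seconds
            except FileNotFoundError:
                pass  # another request already removed it


    return count


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(count_same_second(demo_dir, 100.10))
    print(count_same_second(demo_dir, 100.20))
```

Because every access writes its own file, no request waits on a lock. The trade-off is that two accesses with an identical microsecond timestamp would share one file and be counted once.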

5. Environments

The sample code targets Linux. I prepared sample code for Python 3, PHP 5, Ruby 2, and Perl 5.

6. "Python" Sample Code

Seconds Access Limiter with Python. Version: Python-3

    #!/usr/bin/python
    # coding:utf-8

    import time
    # NOTE: the cgi module was removed in Python 3.13; there, parse QUERY_STRING with urllib.parse instead
    import cgi
    import os
    from pathlib import Path
    import json

    # Definition
    def limitSecondsAccess():
        try:
            # Init
            ## Access Timestamp Build
            sec_usec_timestamp = time.time()
            sec_timestamp = int(sec_usec_timestamp)

            ## Access Limit Default Value
            ### Depends on Specifications: For Example 10
            access_limit = 10

            ## Roots Build
            ### Depends on Environment: For Example '/tmp'
            tmp_root = '/tmp'
            access_root = os.path.join(tmp_root, 'access')

            ## Auth Key
            ### Depends on Specifications: For Example 'app_id'
            auth_key = 'app_id'

            ## Response Content-Type
            ### Depends on Specifications: For Example JSON and UTF-8
            response_content_type = 'Content-Type: application/json; charset=utf-8'

            ## Response Bodies Build
            ### Depends on Design
            response_bodies = {}

            # Authorized Key Check
            ## NOTE: auth_id becomes part of a path; validate it in a real service
            query = cgi.FieldStorage()
            auth_id = query.getvalue(auth_key)
            if not auth_id:
                raise Exception('Unauthorized', 401)

            # The Auth Root Build
            auth_root = os.path.join(access_root, auth_id)

            # The Auth Root Check
            if not os.path.isdir(auth_root):
                # The Auth Root Creation
                os.makedirs(auth_root, exist_ok=True)

            # An Access File Creation Using Micro Timestamp
            ## For example, other data resources such as memory cache or RDB transaction.
            ## In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
            ## However, in the case of a cluster configuration, file system synchronization is required.
            access_file_path = os.path.join(auth_root, str(sec_usec_timestamp))
            path = Path(access_file_path)
            path.touch()

            # The Access Counts Check
            access_counts = 0
            for base_name in os.listdir(auth_root):
                ## An Access File Path Build
                file_path = os.path.join(auth_root, base_name)

                ## Not File Type
                if not os.path.isfile(file_path):
                    continue

                ## The Base Name Data Type Casting
                try:
                    base_name_sec_usec_timestamp = float(base_name)
                except ValueError:
                    ### Not an access file
                    continue
                base_name_sec_timestamp = int(base_name_sec_usec_timestamp)

                ## Same Seconds Timestamp
                if sec_timestamp == base_name_sec_timestamp:

                    ### An Overtaken Processing
                    if sec_usec_timestamp < base_name_sec_usec_timestamp:
                        continue

                    ### Access Counts Increment
                    access_counts += 1

                    ### Too Many Requests
                    if access_counts > access_limit:
                        raise Exception('Too Many Requests', 429)

                    continue

                ## Past Access Files Garbage Collection
                if sec_timestamp > base_name_sec_timestamp:
                    try:
                        os.remove(file_path)
                    except FileNotFoundError:
                        ### Already removed by a concurrent request
                        pass

        except Exception as e:
            # Exception Tuple to HTTP Status Code
            ## Exceptions not raised above (e.g. OSError) carry no HTTP code and fall through to 500
            http_status = str(e.args[0]) if e.args else ''
            http_code = e.args[1] if len(e.args) > 1 and isinstance(e.args[1], int) else 0

            # 4xx
            if 400 <= http_code <= 499:
                # logging
                ## snip...
                pass
            # 5xx
            elif http_code >= 500:
                # logging
                ## snip...

                ## The Exception Message to HTTP Status
                ## (the internal message is replaced with a generic one)
                http_status = 'Internal Server Error'
            # Others
            else:
                # logging
                ## snip...

                # HTTP Status Code for The Response
                http_status = 'Internal Server Error'
                http_code = 500

            # Response Headers Feed
            ## print() appends one newline, so "\n" yields the blank line that ends the headers
            print('Status: ' + str(http_code) + ' ' + http_status)
            print(response_content_type + "\n")

            # A Response Body Build
            response_bodies['message'] = http_status
            response_body = json.dumps(response_bodies)

            # The Response Body Feed
            print(response_body)

    # Execution
    limitSecondsAccess()

7. "PHP" Sample Code

Seconds Access Limiter with PHP. Version: PHP-5

    <?php
    # Definition
    function limitSecondsAccess()
    {
        try {
            # Init
            ## Access Timestamp Build
            ### Format explicitly: casting a float to string keeps only `precision` significant digits
            $sec_usec_timestamp_string = sprintf('%.6F', microtime(true));
            $sec_usec_timestamp = floatval($sec_usec_timestamp_string);
            $sec_timestamp = intval($sec_usec_timestamp);

            ## Access Limit Default Value
            ### Depends on Specifications: For Example 10
            $access_limit = 10;

            ## Roots Build
            ### Depends on Environment: For Example '/tmp'
            $tmp_root = '/tmp';
            $access_root = $tmp_root . '/access';

            ## Auth Key
            ### Depends on Specifications: For Example 'app_id'
            $auth_key = 'app_id';

            ## Response Content-Type
            ### Depends on Specifications: For Example JSON and UTF-8
            $response_content_type = 'Content-Type: application/json; charset=utf-8';

            ## Response Bodies Build
            ### Depends on Design
            $response_bodies = array();

            # Authorized Key Check
            ## NOTE: $auth_id becomes part of a path; validate it in a real service
            if (empty($_REQUEST[$auth_key])) {
                throw new Exception('Unauthorized', 401);
            }
            $auth_id = $_REQUEST[$auth_key];

            # The Auth Root Build
            $auth_root = $access_root . '/' . $auth_id;

            # The Auth Root Check
            if (! is_dir($auth_root)) {
                ## The Auth Root Creation
                ## (re-check is_dir: a concurrent request may have created it first)
                if (! mkdir($auth_root, 0775, true) && ! is_dir($auth_root)) {
                    throw new Exception('Could not create the auth root. ' . $auth_root, 500);
                }
            }

            # An Access File Creation Using Micro Timestamp
            /* For example, other data resources such as memory cache or RDB transaction.
             * In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
             * However, in the case of a cluster configuration, file system synchronization is required.
             */
            $access_file_path = $auth_root . '/' . $sec_usec_timestamp_string;
            if (! touch($access_file_path)) {
                throw new Exception('Could not create the access file. ' . $access_file_path, 500);
            }

            # The Auth Root Scanning
            if (! $base_names = scandir($auth_root)) {
                throw new Exception('Could not scan the auth root. ' . $auth_root, 500);
            }

            # The Access Counts Check
            $access_counts = 0;
            foreach ($base_names as $base_name) {
                ## A current or parent dir
                if ($base_name === '.' || $base_name === '..') {
                    continue;
                }

                ## An Access File Path Build
                $file_path = $auth_root . '/' . $base_name;

                ## Not File Type
                if (! is_file($file_path)) {
                    continue;
                }

                ## The Base Name to Integer Data Type
                $base_name_sec_timestamp = intval($base_name);

                ## Same Seconds Timestamp (both sides are integers, so === is safe)
                if ($sec_timestamp === $base_name_sec_timestamp) {
                    ### The Base Name to Float Data Type
                    $base_name_sec_usec_timestamp = floatval($base_name);

                    ### An Overtaken Processing
                    if ($sec_usec_timestamp < $base_name_sec_usec_timestamp) {
                        continue;
                    }

                    ### Access Counts Increment
                    $access_counts++;

                    ### Too Many Requests
                    if ($access_counts > $access_limit) {
                        throw new Exception('Too Many Requests', 429);
                    }

                    continue;
                }

                ## Past Access Files Garbage Collection
                if ($sec_timestamp > $base_name_sec_timestamp) {
                    @unlink($file_path);
                }
            }
        } catch (Exception $e) {
            # The Exception to HTTP Status Code
            $http_code = $e->getCode();
            $http_status = $e->getMessage();

            # 4xx
            if ($http_code >= 400 && $http_code <= 499) {
                # logging
                ## snip...
            # 5xx
            } else if ($http_code >= 500) {
                # logging
                ## snip...

                # The Exception Message to HTTP Status
                # (the internal message is replaced with a generic one)
                $http_status = 'Internal Server Error';
            # Others
            } else {
                # logging
                ## snip...

                # HTTP Status Code for The Response
                $http_status = 'Internal Server Error';
                $http_code = 500;
            }

            # Response Headers Feed
            header('HTTP/1.1 ' . $http_code . ' ' . $http_status);
            header($response_content_type);

            # A Response Body Build
            $response_bodies['message'] = $http_status;
            $response_body = json_encode($response_bodies);

            # The Response Body Feed
            exit($response_body);
        }
    }

    # Execution
    limitSecondsAccess();
    ?>

8. "Ruby" Sample Code

Seconds Access Limiter with Ruby. Version: Ruby-2

    #!/usr/bin/ruby
    # -*- coding: utf-8 -*-

    require 'fileutils'
    require 'cgi'
    require 'json'

    # Definition
    def limitSecondsAccess
      # Created outside begin so that the rescue clause can always use it
      cgi = CGI.new

      begin
        # Init
        ## Access Timestamp Build
        time = Time.now
        sec_timestamp = time.to_i
        sec_usec_timestamp_string = "%10.6f" % time.to_f
        sec_usec_timestamp = sec_usec_timestamp_string.to_f

        ## Access Limit Default Value
        ### Depends on Specifications: For Example 10
        access_limit = 10

        ## Roots Build
        ### Depends on Environment: For Example '/tmp'
        tmp_root = '/tmp'
        access_root = tmp_root + '/access'

        ## Auth Key
        ### Depends on Specifications: For Example 'app_id'
        auth_key = 'app_id'

        ## Response Content-Type
        ### Depends on Specifications: For Example JSON and UTF-8
        response_content_type = 'application/json'
        response_charset = 'utf-8'

        ## Response Bodies Build
        ### Depends on Design
        response_bodies = {}

        # Authorized Key Check
        ## NOTE: auth_id becomes part of a path; validate it in a real service
        if ! cgi.has_key?(auth_key) then
          raise 'Unauthorized:401'
        end
        auth_id = cgi[auth_key]

        # The Auth Root Build
        auth_root = access_root + '/' + auth_id

        # The Auth Root Check
        if ! FileTest::directory?(auth_root) then
          # The Auth Root Creation
          ## FileUtils raises on failure; the rescue clause below reports it as 500
          FileUtils.mkdir_p(auth_root, :mode => 0775)
        end

        # An Access File Creation Using Micro Timestamp
        ## For example, other data resources such as memory cache or RDB transaction.
        ## In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
        ## However, in the case of a cluster configuration, file system synchronization is required.
        access_file_path = auth_root + '/' + sec_usec_timestamp_string
        FileUtils.touch(access_file_path)

        # The Access Counts Check
        access_counts = 0
        Dir.glob(auth_root + '/*') do |file_path|

          # Not File Type
          if ! FileTest::file?(file_path) then
            next
          end

          # The File Path to The Base Name
          base_name = File.basename(file_path)

          # The Base Name to Integer Data Type
          base_name_sec_timestamp = base_name.to_i

          # Same Seconds Timestamp
          if sec_timestamp == base_name_sec_timestamp then

            ### The Base Name to Float Data Type
            base_name_sec_usec_timestamp = base_name.to_f

            ### An Overtaken Processing
            if sec_usec_timestamp < base_name_sec_usec_timestamp then
              next
            end

            ### Access Counts Increment
            access_counts += 1

            ### Too Many Requests
            if access_counts > access_limit then
              raise 'Too Many Requests:429'
            end

            next
          end

          # Past Access Files Garbage Collection
          ## rm_f ignores a file already removed by a concurrent request
          if sec_timestamp > base_name_sec_timestamp then
            FileUtils.rm_f(file_path)
          end
        end

        # The Response Feed
        cgi.out({
          ## Response Headers Feed
          'type' => 'text/html',
          'charset' => response_charset,
        }) {
          ## The Response Body Feed
          ''
        }

      rescue => e
        # Exception to HTTP Status Code
        ## Messages without a ':code' suffix (e.g. system errors) yield 0 and fall through to 500
        messages = e.message.split(':')
        http_status = messages[0].to_s
        http_code = messages[1].to_i

        # 4xx
        if http_code >= 400 && http_code <= 499 then
          # logging
          ## snip...
        # 5xx
        elsif http_code >= 500 then
          # logging
          ## snip...

          # The Exception Message to HTTP Status
          # (the internal message is replaced with a generic one)
          http_status = 'Internal Server Error'
        else
          # logging
          ## snip...

          # HTTP Status Code for The Response
          http_status = 'Internal Server Error'
          http_code = 500
        end

        # The Response Body Build
        response_bodies = { 'message' => http_status }
        response_body = JSON.generate(response_bodies)

        # The Response Feed
        cgi.out({
          ## Response Headers Feed
          'status' => "#{http_code} #{http_status}",
          'type' => 'application/json',
          'charset' => 'utf-8',
        }) {
          ## The Response Body Feed
          response_body
        }
      end
    end

    # Execution
    limitSecondsAccess

9. "Perl" Sample Code

Seconds Access Limiter with Perl. Version: Perl-5

    #!/usr/bin/perl

    use strict;
    use warnings;
    use utf8;
    use Time::HiRes qw(gettimeofday);
    use CGI;
    use File::Basename;
    use JSON;

    # Definition
    sub limitSecondsAccess {

        ## Response Content-Type
        ### Depends on Specifications: For Example JSON and UTF-8
        my $response_content_type = 'Content-Type: application/json; charset=utf-8';

        ## Response Bodies Build
        ### Depends on Design
        my %response_bodies;

        eval {
            # Init
            ## Access Timestamp Build
            ### Zero-pad the microseconds so that e.g. 1234 usec becomes ".001234"
            my ($sec_timestamp, $usec_timestamp) = gettimeofday();
            my $sec_usec_timestamp = sprintf('%d.%06d', $sec_timestamp, $usec_timestamp);

            ## Access Limit Default Value
            ### Depends on Specifications: For Example 10
            my $access_limit = 10;

            ## Roots Build
            ### Depends on Environment: For Example '/tmp'
            my $tmp_root = '/tmp';
            my $access_root = $tmp_root . '/access';

            ## Auth Key
            ### Depends on Specifications: For Example 'app_id'
            my $auth_key = 'app_id';

            # Authorized Key Check
            ## NOTE: $auth_id becomes part of a path; validate it in a real service
            my $cgi = CGI->new;
            my $auth_id = $cgi->param($auth_key);
            if (! defined($auth_id)) {
                die('Unauthorized`401`');
            }

            # The Auth Root Build
            my $auth_root = $access_root . '/' . $auth_id;

            # The Access Root Check
            if (! -d $access_root) {
                ## The Access Root Creation
                ## (re-check -d: a concurrent request may have created it first)
                if (! mkdir($access_root) && ! -d $access_root) {
                    die('Could not create the access root. ' . $access_root . '`500`');
                }
            }

            # The Auth Root Check
            if (! -d $auth_root) {
                ## The Auth Root Creation
                if (! mkdir($auth_root) && ! -d $auth_root) {
                    die('Could not create the auth root. ' . $auth_root . '`500`');
                }
            }

            # An Access File Creation Using Micro Timestamp
            ## For example, other data resources such as memory cache or RDB transaction.
            ## In the case of this sample code, it is lightweight because it does not require file locking and transaction processing.
            ## However, in the case of a cluster configuration, file system synchronization is required.
            my $access_file_path = $auth_root . '/' . $sec_usec_timestamp;
            open(my $fh, '>', $access_file_path)
                or die('Could not create the access file. ' . $access_file_path . '`500`');
            close($fh);

            # The Auth Root Scanning
            my @file_paths = glob($auth_root . "/*");
            if (! @file_paths) {
                die('Could not scan the auth root. ' . $auth_root . '`500`');
            }

            # The Access Counts Check
            my $access_counts = 0;
            foreach my $file_path (@file_paths) {

                ## Not File Type
                if (! -f $file_path) {
                    next;
                }

                ## The Base Name Extract
                my $base_name = basename($file_path);

                ## The Base Name to Integer Data Type
                my $base_name_sec_timestamp = int($base_name);

                ## Same Seconds Timestamp (numeric comparison)
                if ($sec_timestamp == $base_name_sec_timestamp) {
                    ## The Base Name to Float Data Type
                    my $base_name_sec_usec_timestamp = $base_name;

                    ### An Overtaken Processing
                    if ($sec_usec_timestamp < $base_name_sec_usec_timestamp) {
                        next;
                    }

                    ### Access Counts Increment
                    $access_counts++;

                    ### Too Many Requests
                    if ($access_counts > $access_limit) {
                        die("Too Many Requests`429`");
                    }

                    next;
                }

                ## Past Access Files Garbage Collection
                if ($sec_timestamp > $base_name_sec_timestamp) {
                    unlink($file_path);
                }
            }
        };

        if ($@) {
            # Error Elements Extract
            my @e = split(/`/, $@);

            # Exception to HTTP Status Code
            ## Errors without a `code` part fall through to 500
            my $http_status = $e[0];
            my $http_code = 0;
            if (defined($e[1]) && $e[1] =~ /^\d{3}$/) {
                $http_code = $e[1];
            }

            # 4xx
            if ($http_code >= 400 && $http_code <= 499) {
                # logging
                ## snip...
            # 5xx
            } elsif ($http_code >= 500) {
                # logging
                ## snip...

                ## The Exception Message to HTTP Status
                ## (the internal message is replaced with a generic one)
                $http_status = 'Internal Server Error';
            # Others
            } else {
                # logging
                ## snip...

                $http_status = 'Internal Server Error';
                $http_code = 500;
            }

            # Response Headers Feed
            print("Status: " . $http_code . " " . $http_status . "\n");
            print($response_content_type . "\n\n");

            # A Response Body Build
            $response_bodies{'message'} = $http_status;
            my $response_body = encode_json(\%response_bodies);

            # The Response Body Feed
            print($response_body);
        }

    }

    # Execution
    limitSecondsAccess();

10. Conclusion

In this post, I presented a sample “One-Second Access Limiter” in four interpreted languages: Python, PHP, Ruby, and Perl. Because the control window is only one second, the check requires low load, high speed, and data consistency. The key points for achieving this are the ones described in the architecture section. The solution shown here relies on file names and the number of files in the file system. In a clustered environment, however, this architecture is unsuitable if the chosen data synchronization solution is slow. In that case, an asynchronous data architecture may be the better option, with the limit enforced on a per-node basis. The load-balancing threshold then becomes more important, and the precision of the access limit and the consistency of the result must be given up. If neither is strictly required, that is also a valid solution.
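
The loss of precision under per-node control can be illustrated with simple arithmetic. The sketch below assumes the cluster-wide limit is split evenly across nodes and rounded up; the function names `per_node_limit` and `worst_case_total` are hypothetical and not part of the sample code above.

```python
import math


def per_node_limit(total_limit, node_count):
    """Split a cluster-wide per-second limit evenly across nodes (rounded up)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return math.ceil(total_limit / node_count)


def worst_case_total(total_limit, node_count):
    """Largest number of accesses per second the whole cluster may let through."""
    return per_node_limit(total_limit, node_count) * node_count
```

With a limit of 10 and three nodes, each node would allow 4 accesses per second, so the cluster as a whole could let through up to 12. Conversely, if the load balancer sends most traffic to one node, that node starts rejecting while the cluster total is still below 10.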
