OpenResty-based SOAP caching proxy for CNPq / Lattes Platform WebServices.
The Lattes Platform is a government-maintained public database of Brazilian researchers' curricula, which anyone can browse with a web browser. Raw curriculum metadata in XML format can also be obtained, but automated download (without a CAPTCHA) is officially permitted only through a SOAP webservice, which is made available only to Brazilian research and higher education institutions. However, each institution can request access rights for a single IP address only. Thus, this proxy has the following goals:
To allow the institution to manage its own internal list of IP addresses which are allowed to access the webservice.
To ensure that multiple applications from the same institution accessing the webservice do not cause significant overhead on CNPq servers, by caching responses whenever possible.
To preserve compatibility with any existing applications. Replacing the webservice address in the WSDL file or overriding DNS response using
should be enough to cause an application to use the proxy.
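The first goal above, an institution-managed list of allowed addresses, can be illustrated with a small check against CIDR ranges. This is only a sketch in Python using the standard `ipaddress` module; it is not the proxy's actual access-control code, and the function names (`parse_ranges`, `is_allowed`) and the one-range-per-line format with `#` comments are assumptions made for the example.

```python
import ipaddress


def parse_ranges(lines):
    """Parse one CIDR range per line, ignoring blanks and '#' comments.

    The line format is hypothetical, chosen only for this example.
    """
    ranges = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            # strict=False accepts entries such as 192.0.2.5/24.
            ranges.append(ipaddress.ip_network(entry, strict=False))
    return ranges


def is_allowed(client_ip, ranges):
    """Return True if client_ip falls inside any of the allowed ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ranges)


# Example ranges: localhost plus a documentation-only network (RFC 5737).
ALLOWED = parse_ranges([
    "127.0.0.0/8",
    "::1/128",
    "192.0.2.0/24  # hypothetical lab network",
])
```

With these ranges, `is_allowed("192.0.2.77", ALLOWED)` is true and `is_allowed("198.51.100.1", ALLOWED)` is false.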
This project also aims to serve as a good example of how to create smart SOAP caches. Aside from some specific configurations (conf/) and from specific hooks (wscallhooks) in scripts/cnpqws_cache_key.lua, the code should be pretty generic and adaptable for caching other SOAP webservices.
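To give an idea of what a generic SOAP cache key might look like, the sketch below derives a key from the operation name and its parameters, so that two requests which differ only in namespace prefixes or whitespace map to the same cache entry. It is written in Python purely for illustration and does not reproduce the project's Lua code; `soap_cache_key` and the `http://ws.example/` namespace are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope


def local_name(tag):
    """Drop the '{namespace}' part that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def soap_cache_key(request_xml):
    """Build a cache key from the operation name and its sorted parameters."""
    root = ET.fromstring(request_xml)
    body = root.find(f"{{{SOAP_ENV_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("request has no SOAP Body with an operation element")
    operation = body[0]
    op_name = local_name(operation.tag)
    # Sorting makes the key independent of parameter order.
    params = sorted(
        (local_name(child.tag), (child.text or "").strip())
        for child in operation
    )
    canonical = op_name + "?" + "&".join(f"{k}={v}" for k, v in params)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{op_name}:{digest}"


# Hypothetical request; the service namespace is a placeholder.
REQUEST = """
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:ws="http://ws.example/">
  <soapenv:Body>
    <ws:getDataAtualizacaoCV>
      <id>9110045494449064</id>
    </ws:getDataAtualizacaoCV>
  </soapenv:Body>
</soapenv:Envelope>
"""

key = soap_cache_key(REQUEST)
```

Keeping the operation name readable at the front of the key makes it easier to apply per-operation caching times or hooks, while the hash keeps the key short regardless of how many parameters a call has.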
OpenResty is a platform built around nginx, LuaJIT, and some extension modules, on top of which our proxy was developed.
Please follow the instructions here to install official OpenResty packages for your distro.
Initialize the repository submodules:
git submodule init && git submodule update
Generate a private key and the corresponding certificate for the embedded TLS server. You may provide certificate information to openssl through the
environment variable. Then call:
make tls-conf
If you want to use a valid TLS certificate, sign
using your preferred CA and override conf/tls/private/cnpqwsproxy.crt with the certificate provided by the CA.
The wsdl.replacewith option contains a URL whose domain should be changed to your institution's domain. If desired, also customize any caching times in this file. Please note that although curricula are cached for 6 months by default, the
defined in scripts/cnpqws_cache_key.lua implies that the modification time will always be looked up to check whether a new version of a curriculum is available. By default, the modification time is cached for only 15 minutes.
Add the IP address ranges which are allowed to access the proxy to the
file. By default, the proxy only allows requests from localhost.
Start the proxy application:
./app start
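The default caching behaviour described in the configuration step (curricula kept for about 6 months, modification times kept for 15 minutes and always consulted first) can be sketched as a two-level cache. The Python below is only an illustration under stated assumptions, not the proxy's Lua implementation: the class name `RevalidatingCache`, the two fetch callbacks, and the choice of keying curricula by `(id, modification time)` are all hypothetical.

```python
import time

CURRICULUM_TTL = 180 * 24 * 3600  # roughly 6 months, in seconds
MTIME_TTL = 15 * 60               # 15 minutes, in seconds


class RevalidatingCache:
    """Long-lived curriculum cache validated by a short-lived mtime cache."""

    def __init__(self, fetch_mtime, fetch_curriculum, clock=time.time):
        self._fetch_mtime = fetch_mtime            # callable: id -> mtime
        self._fetch_curriculum = fetch_curriculum  # callable: id -> XML
        self._clock = clock
        self._mtimes = {}     # id -> (mtime, expires_at)
        self._curricula = {}  # (id, mtime) -> (xml, expires_at)

    def get_mtime(self, cv_id):
        now = self._clock()
        entry = self._mtimes.get(cv_id)
        if entry is None or entry[1] <= now:
            entry = (self._fetch_mtime(cv_id), now + MTIME_TTL)
            self._mtimes[cv_id] = entry
        return entry[0]

    def get_curriculum(self, cv_id):
        now = self._clock()
        # The modification time is part of the key, so a changed
        # curriculum misses the cache even though its old copy has
        # not expired yet.
        key = (cv_id, self.get_mtime(cv_id))
        entry = self._curricula.get(key)
        if entry is None or entry[1] <= now:
            entry = (self._fetch_curriculum(cv_id), now + CURRICULUM_TTL)
            self._curricula[key] = entry
        return entry[0]
```

With this arrangement, the expensive curriculum download happens only when the modification time changes, while the cheap modification-time lookup reaches the upstream server at most once every 15 minutes per curriculum.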
Install ipython and SUDS:
sudo apt-get install ipython python-suds
and try some webservice calls:
In [1]: import suds
In [2]: c = suds.client.Client('https://your-domain:7443/srvcurriculo/WSCurriculo?wsdl')
In [3]: c.service.getDataAtualizacaoCV(id='9110045494449064')  # in memoriam
Out[3]: 27/05/2004 11:39:54
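The value returned by getDataAtualizacaoCV in the session above is a day/month/year string. If a client needs to compare modification times, a minimal sketch for converting it to a `datetime` follows; the helper name `parse_cv_timestamp` is hypothetical, and the result is a naive `datetime` because the response shown carries no time zone information.

```python
from datetime import datetime

# Format inferred from the sample response: DD/MM/YYYY HH:MM:SS.
CV_TIMESTAMP_FORMAT = "%d/%m/%Y %H:%M:%S"


def parse_cv_timestamp(value):
    """Convert a 'DD/MM/YYYY HH:MM:SS' string into a naive datetime."""
    return datetime.strptime(value.strip(), CV_TIMESTAMP_FORMAT)


updated_at = parse_cv_timestamp("27/05/2004 11:39:54")
# updated_at == datetime(2004, 5, 27, 11, 39, 54)
```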
Install Maven.
sudo apt-get install maven
Clone the cxf-repl project:
git clone
Change WSDL files to point to your own domain:
cd cxf-repl
sed -i 's/' src/main/resources/wsdl/*.wsdl
Compile and call the REPL:
make repl
Try some webservice calls:
>>> import as WSCurriculo_Service
>>> service = WSCurriculo_Service()
>>> port = service.WSCurriculoPort
>>> port.getDataAtualizacaoCV('9110045494449064')  # in memoriam
u'27/05/2004 11:39:54'