

You can now talk to the @CastroTiempo Twitter bot, and not only that, it will respond to you!

This update implements some simple interaction capabilities as a first step towards Natural Language Processing: you can now ask the bot what the weather is like in basically any city around the world.

So I was thinking about how I could improve the @CastroTiempo Twitter bot, since it had (and still has) many flaws. The first ideas that came to my mind were the following:

  1. Forecasting capabilities
  2. Interaction capabilities
  3. Update the weather report it provides with the temperature (obviously) and whether it is raining, the skies are clear, and so on.
  4. Put the data in a publicly accessible database, so there's a repository of information that everyone can access and that might serve in the future to study the weather.
  5. Upload a picture of the city at the moment of the report and store it in the database as well. This feature was actually in the previous version, but due to some issues with the Raspberry Pi I could not add it.
So I decided to tackle the second and third points, which I thought were the most urgent ones. The base for it has been the previous code, the one you can find here: https://achefethings.blogspot.com/2021/03/web-scraping-to-feed-twitter-bot.html.

The third point was just a matter of finding somewhere to get that info; in my case I used the OpenWeatherMap API, which was rather simple.
The second point was the tricky one, but I (mostly) pulled it off. For it to work, you need to mention the bot and add the name of the city whose weather you want to know.

This time the task was getting rather involved, since many parts had to come together, namely:
  •  The Web Scraping part of it:
    • The OpenWeatherMap API for the temperature and sky reports (https://openweathermap.org/)
    • Plain scraping of aqicn.org for the other variables, mainly the air quality index and the pollutants (https://aqicn.org/map/spain/cantabria/castro-urdiales/)
from pprint import pprint
import time
import tweepy
import requests
from bs4 import BeautifulSoup
from datetime import datetime
import pytz
import re
import googletrans
from googletrans import Translator

translator = Translator()

# Twitter authentication (keys redacted)
auth = tweepy.OAuthHandler("secret", "secret2")
auth.set_access_token("token1", "token2")

api = tweepy.API(auth)

API_key = 'open_weather_key'

# Current conditions for the home city from the OpenWeatherMap API
city = "Castro-Urdiales"
base_url = "http://api.openweathermap.org/data/2.5/weather?appid=" + API_key + "&q=" + city
weather_data = requests.get(base_url).json()

tiempo = weather_data["weather"][0]["description"]

# Temperatures come in kelvin, so convert to Celsius and keep the first 4 characters
temp_curr = str(float(weather_data["main"]["temp"]) - 273.15)[0:4]
temp_curr_feels_like = str(float(weather_data["main"]["feels_like"]) - 273.15)[0:4]
temperature = f"La temperatura actual es de {temp_curr} ºC pero se sienten como {temp_curr_feels_like} ºC\n"

temp_min = str(float(weather_data["main"]["temp_min"]) - 273.15)[0:4]
temp_max = str(float(weather_data["main"]["temp_max"]) - 273.15)[0:4]
min_max = f"La maxima y minima para hoy son de {temp_min} y {temp_max} ºC \n"

hum = str(float(weather_data["main"]["humidity"]))[0:4]
pres = str(float(weather_data["main"]["pressure"]))[0:4]
hum_press = f"La humedad es del {hum} % y la presión atmosférica de {pres} hPa\n"

# Scrape aqicn.org for the wind speed and the air quality index
pattern = re.compile(r"(>\d{2}<)|(>\d{3}<)|(>\d{1}<)|(>\d{0}<)|(>-<)|(>\d{4}<)")
res = requests.get("https://aqicn.org/city/spain/cantabria/castro-urdiales/es/")
soup = BeautifulSoup(res.text, 'html.parser')

wind = str(soup.find(id="cur_w"))
wind_str = pattern.search(wind).group()
wind_deg = str(float(weather_data["wind"]["deg"]))[0:6]

# Turn the wind direction in degrees into a cardinal direction
if 315 < float(wind_deg) < 360 or 0 < float(wind_deg) < 45:
    viento = f"Tenemos viento norte a {wind_str[1:len(wind_str)-1]} km/h\n"
elif 45 < float(wind_deg) < 135:
    viento = f"Tenemos viento del este a {wind_str[1:len(wind_str)-1]} km/h\n"
elif 135 < float(wind_deg) < 225:
    viento = f"Tenemos viento del sur a {wind_str[1:len(wind_str)-1]} km/h\n"
elif 225 < float(wind_deg) < 315:
    viento = f"Tenemos viento del oeste a {wind_str[1:len(wind_str)-1]} km/h\n"
else:
    viento = ""

ACI = str(soup.find(id="aqiwgtvalue"))
ACI_str = pattern.search(ACI).group()
calidad = f"El nivel de Calidad del Aire es {ACI_str[1:len(ACI_str) - 1]}\n"
  • The translation part of it:
    • As the information grabbed is in English, I had to use an API to translate it, adding special cases so the translation fits the style of my reports (grammar-wise).
translation = translator.translate(tiempo, dest='es')

# Patch the machine translation so the wording matches the style of the reports
if translation.text == "cielo limpio":
    translation.text = "cielos despejados"
elif translation.text == "pocas nubes":
    translation.text = "cielos poco nublados"
elif translation.text == "nubes rotas":
    translation.text = "nubes dispersas"
elif translation.text == "nubes nubladas":
    translation.text = "cielos nublados"
elif translation.text == "lluvia ligera":
    translation.text = "lluvias ligeras"
elif translation.text == "neblina":
    translation.text = "niebla"
elif translation.text == "lluvia intensa":
    translation.text = "lluvias intensas"

t1 = f"Tenemos {translation.text}\n"
  • The interactive part of it:
    • The program needs to be able to detect when someone mentions the bot.
i = 0
while i < 60:  # check for new mentions once a minute, for an hour
    i = i + 1
    file_name = "twitter_mentions.txt"
    try:
        # Mentions already seen in previous iterations
        current_mentions = []
        with open(file_name) as f:
            for line in f:
                current_mentions.append(line.strip("\n"))

        # Append the latest mention to the file
        with open(file_name, "a") as f:
            for mention in api.mentions_timeline(count=1):
                f.write("@" + mention.user.screen_name + " mentioned you: " + mention.text + "\n")

        new_mentions = []
        with open(file_name) as f:
            for line in f:
                new_mentions.append(line.strip("\n"))

        # Whatever is in the file now but wasn't there before is a new mention
        new_mentions_list = set(new_mentions).difference(set(current_mentions))
        for new_mention in new_mentions_list:
            reply = new_mention.split()[0]
            words_text_tweet = mention.text.split()
            # Only react to tweets of the form "@CastroTiempo CityName"
            if len(words_text_tweet) == 2:
                words_text_tweet.remove("@CastroTiempo")
                city_search = words_text_tweet[0]
                print(city_search)
                if len(city_search) < 25:
                    base_url_response = "http://api.openweathermap.org/data/2.5/weather?appid=" + API_key + "&q=" + city_search
                    busqueda = requests.get(base_url_response).json()
                    if len(busqueda) == 2:
                        # The error response only has two keys: city not found
                        respuesta = f"Lo siento {reply}, no he encontrado {city_search} en mi base de datos"
                        api.update_status(respuesta)
                    else:
                        calor = str(busqueda["main"]["temp"] - 273.15)[0:4]
                        tiempo_search = busqueda["weather"][0]["description"]
                        translation_search = translator.translate(tiempo_search, dest='es')
                        respuesta = f"Hola {reply}, en {city_search} hay {translation_search.text} y la temperatura es de {calor} ºC, para obtener mas información pincha en {base_url_response}"
                        api.update_status(respuesta)
    except tweepy.RateLimitError:
        pass
    time.sleep(60)
    • It needs to be able to distinguish whether the interaction attempt was legit, a joke, or just someone trying to mess around with it.
if len(city_search) < 25:
    base_url_response = "http://api.openweathermap.org/data/2.5/weather?appid=" + API_key + "&q=" + city_search
    • If the intent was legit, it needs to extract the city name from that tweet, run a search through the APIs, translate the result, and post the response.

The bot still has many limitations, of course; let's think of it as a tool for me to learn Python and get better at it.

The first and most obvious limitation is that it will only respond for cities whose names are shorter than 25 characters (there is a reason for this that I will explain later), and that it will only respond to mentions with one-word city names; if the name has more than one word, you have to write it all together (this one has less of a reason behind it).

The other (not so obvious) limitation is the uncertain behavior of the code. Since it depends on many external services to do its job, if one of those pages stops responding as it should and that situation isn't covered by a try/except block, the code will just crash. On top of that, it depends on three APIs and I'm using the free tier of all of them, so the rate limits are tight and it might crash now and then.
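One way to soften that limitation would be to wrap every external request in a try/except with a couple of retries. This is just a sketch of the idea, not code the bot currently runs; the helper name and the retry parameters are made up:

```python
import time
import requests

def safe_get_json(url, retries=3, wait=60):
    """Hypothetical helper: retry a flaky endpoint instead of crashing the bot."""
    for _ in range(retries):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()  # turn HTTP errors (e.g. rate limits) into exceptions
            return res.json()
        except requests.RequestException:
            time.sleep(wait)  # back off before trying again
    return None  # caller decides what to do when the service stays down
```

Every place the bot calls requests.get(...).json() could then check for None and skip that report instead of dying.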

Why is the length of the city limited to 25 characters?

As most of you are aware, people do bad stuff on the internet, and some of them might be looking forward to destroying your carefully thought-out Twitter bot; in fact, some people warned me that they would do it. Now that the code is public it's probably even easier for them (I guess, I don't really know), but I still tried to protect myself by putting a limit on the number of characters the code will take, to avoid code injection or things like that. But what is the rationale for choosing 25 characters and not 30 or 20?

Well, as y'all know, I am learning data science, so I used pandas to study how many letters city names have. To do so I followed these steps:
  1. I downloaded a database with the names of cities around the world (https://simplemaps.com/data/world-cities)
  2. I plotted a histogram of the number of cities vs. the number of letters in each city's name.
  3. After a bit of statistical analysis I found that 25 characters cover more than 99% of the cities in the world, so I figured cities like Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch (https://es.wikipedia.org/wiki/Llanfairpwllgwyngyll) can do without knowing what the weather is like over there.
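The steps above can be sketched roughly like this. The real analysis used the full simplemaps CSV; here I use a tiny made-up sample, and the "city" column name is an assumption about that file:

```python
import pandas as pd

# Made-up sample; the real run loaded the simplemaps file, e.g.
# df = pd.read_csv("worldcities.csv")
df = pd.DataFrame({"city": [
    "Bilbao",
    "Santander",
    "Castro-Urdiales",
    "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch",
]})

lengths = df["city"].str.len()

# Fraction of city names that fit under the 25-character limit
coverage = (lengths < 25).mean()
print(f"{coverage:.0%} of this sample fits")

# Step 2's histogram: number of cities per name length
# lengths.value_counts().sort_index().plot(kind="bar")
```

On the full dataset that coverage figure is where the "more than 99%" came from.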


That's about it for this post, I hope you liked it. I will keep working on the Twitter bot, adding some of the features I listed above, and I will keep you informed.

I will soon post something about the Analytic Hierarchy Process (https://en.wikipedia.org/wiki/Analytic_hierarchy_process_%E2%80%93_leader_example) in case you want to check it out.


:D





