

You can now talk to the @CastroTiempo Twitter bot, and not only that, it will respond to you!

This update implements some simple interaction capabilities as a first step towards Natural Language Processing: you can now ask the bot what the weather is like in basically any city around the world.

So I was thinking about how I could improve the @CastroTiempo Twitter bot, since it had (and still has) many flaws. The first ideas that came to my mind were the following:

  1. Forecasting capabilities
  2. Interaction capabilities
  3. Update the weather report it provides with the temperature (obviously) and whether it is raining, the skies are clear, and so on.
  4. Put the data in a publicly accessible database, so there's a repository of information that everyone can access and that might serve in the future to study the weather.
  5. Upload a picture of the city at the moment of the report and store it in the database as well. This feature was actually in the previous version, but due to some issues with the Raspberry Pi I could not add it.
So I decided to tackle the second and third points, which I thought were the most urgent ones. The base for it has been the previous code, the one you can find here: https://achefethings.blogspot.com/2021/03/web-scraping-to-feed-twitter-bot.html.

The third point was just a matter of finding somewhere to get that info; in my case I used the OpenWeatherMap API, which was rather simple.
The second point was the tricky one, but I (mostly) pulled it off. For it to work, you need to mention the bot and add the name of the city whose weather you want to know.

This time the task was getting rather involved, since many parts had to come together, namely:
  •  The Web Scraping part of it:
    • The OpenWeatherMap API for the temperature and sky reports (https://openweathermap.org/)
    • Plain scraping of aqicn.org for the other variables, mainly the air quality index and the pollutants (https://aqicn.org/map/spain/cantabria/castro-urdiales/)
from pprint import pprint
import time
import tweepy
import requests
from bs4 import BeautifulSoup
from datetime import datetime
import pytz
import re
import googletrans
from googletrans import Translator

translator = Translator()

# Twitter authentication (keys redacted)
auth = tweepy.OAuthHandler("secret", "secret2")
auth.set_access_token("token1", "token2")

api = tweepy.API(auth)

API_key = 'open_weather_key'

# Current conditions for the home city from the OpenWeatherMap API
city = "Castro-Urdiales"
base_url = "http://api.openweathermap.org/data/2.5/weather?appid=" + API_key + "&q=" + city
weather_data = requests.get(base_url).json()

tiempo = weather_data["weather"][0]["description"]

# Temperatures come in kelvin, so convert to Celsius and keep the first 4 characters
temp_curr = str(float(weather_data["main"]["temp"]) - 273.15)[0:4]
temp_curr_feels_like = str(float(weather_data["main"]["feels_like"]) - 273.15)[0:4]
temperature = f"La temperatura actual es de {temp_curr} ºC pero se sienten como {temp_curr_feels_like} ºC\n"

temp_min = str(float(weather_data["main"]["temp_min"]) - 273.15)[0:4]
temp_max = str(float(weather_data["main"]["temp_max"]) - 273.15)[0:4]
min_max = f"La maxima y minima para hoy son de {temp_min} y {temp_max} ºC \n"

hum = str(float(weather_data["main"]["humidity"]))[0:4]
pres = str(float(weather_data["main"]["pressure"]))[0:4]
hum_press = f"La humedad es del {hum} % y la presión atmosférica de {pres} hPa\n"

# Scrape aqicn.org for the wind speed and the air quality index
pattern = re.compile(r"(>\d{2}<)|(>\d{3}<)|(>\d{1}<)|(>\d{0}<)|(>-<)|(>\d{4}<)")
res = requests.get("https://aqicn.org/city/spain/cantabria/castro-urdiales/es/")
soup = BeautifulSoup(res.text, 'html.parser')

wind = str(soup.find(id="cur_w"))
wind_str = pattern.search(wind).group()
wind_deg = str(float(weather_data["wind"]["deg"]))[0:6]

# Turn the wind direction in degrees into a cardinal direction
if 315 < float(wind_deg) < 360 or 0 < float(wind_deg) < 45:
    viento = f"Tenemos viento norte a {wind_str[1:len(wind_str)-1]} km/h\n"
elif 45 < float(wind_deg) < 135:
    viento = f"Tenemos viento del este a {wind_str[1:len(wind_str)-1]} km/h\n"
elif 135 < float(wind_deg) < 225:
    viento = f"Tenemos viento del sur a {wind_str[1:len(wind_str)-1]} km/h\n"
elif 225 < float(wind_deg) < 315:
    viento = f"Tenemos viento del oeste a {wind_str[1:len(wind_str)-1]} km/h\n"
else:
    viento = ""

ACI = str(soup.find(id="aqiwgtvalue"))
ACI_str = pattern.search(ACI).group()
calidad = f"El nivel de Calidad del Aire es {ACI_str[1:len(ACI_str) - 1]}\n"
  • The translation part of it:
    • As the information grabbed is in English, I had to use an API to translate it, adding special cases so the translation fits the style of my reports (grammar-wise).
translation = translator.translate(tiempo, dest='es')

# Patch the machine translation so the wording matches the style of the reports
if translation.text == "cielo limpio":
    translation.text = "cielos despejados"
elif translation.text == "pocas nubes":
    translation.text = "cielos poco nublados"
elif translation.text == "nubes rotas":
    translation.text = "nubes dispersas"
elif translation.text == "nubes nubladas":
    translation.text = "cielos nublados"
elif translation.text == "lluvia ligera":
    translation.text = "lluvias ligeras"
elif translation.text == "neblina":
    translation.text = "niebla"
elif translation.text == "lluvia intensa":
    translation.text = "lluvias intensas"

t1 = f"Tenemos {translation.text}\n"
  • The interactive part of it:
    • The program needs to be able to detect when someone mentions the bot.
i = 0
while i < 60:  # check for new mentions once a minute, for an hour
    i = i + 1
    file_name = "twitter_mentions.txt"
    try:
        # Mentions already seen in previous iterations
        current_mentions = []
        with open(file_name) as f:
            for line in f:
                current_mentions.append(line.strip("\n"))

        # Append the latest mention to the file
        with open(file_name, "a") as f:
            for mention in api.mentions_timeline(count=1):
                f.write("@" + mention.user.screen_name + " mentioned you: " + mention.text + "\n")

        new_mentions = []
        with open(file_name) as f:
            for line in f:
                new_mentions.append(line.strip("\n"))

        # Whatever is in the file now but wasn't there before is a new mention
        new_mentions_list = set(new_mentions).difference(set(current_mentions))
        for new_mention in new_mentions_list:
            reply = new_mention.split()[0]
            words_text_tweet = mention.text.split()
            # Only react to tweets of the form "@CastroTiempo CityName"
            if len(words_text_tweet) == 2:
                words_text_tweet.remove("@CastroTiempo")
                city_search = words_text_tweet[0]
                print(city_search)
                if len(city_search) < 25:
                    base_url_response = "http://api.openweathermap.org/data/2.5/weather?appid=" + API_key + "&q=" + city_search
                    busqueda = requests.get(base_url_response).json()
                    if len(busqueda) == 2:
                        # The error response only has two keys: city not found
                        respuesta = f"Lo siento {reply}, no he encontrado {city_search} en mi base de datos"
                        api.update_status(respuesta)
                    else:
                        calor = str(busqueda["main"]["temp"] - 273.15)[0:4]
                        tiempo_search = busqueda["weather"][0]["description"]
                        translation_search = translator.translate(tiempo_search, dest='es')
                        respuesta = f"Hola {reply}, en {city_search} hay {translation_search.text} y la temperatura es de {calor} ºC, para obtener mas información pincha en {base_url_response}"
                        api.update_status(respuesta)
    except tweepy.RateLimitError:
        pass
    time.sleep(60)
    • It needs to be able to distinguish whether the interaction attempt was legit, a joke, or just someone trying to mess around with it.
if len(city_search) < 25:
    base_url_response = "http://api.openweathermap.org/data/2.5/weather?appid=" + API_key + "&q=" + city_search
    • If the intent was legit, it needs to extract the city name from that tweet, run a search through the APIs, translate the result, and post the response.

The bot still has many limitations, of course; let's think of it as a tool for me to learn Python and get better at it.

The first and most obvious limitation is that it will only respond for cities whose names are shorter than 25 characters (there is a reason for this that I will explain later), and that it will only respond to mentions with one-word city names; if the name has more than one word, you have to write it all together (this one has less of a reason behind it).

The other (not so obvious) limitation is the uncertain behavior of the code. Since it depends on many external services to do its job, if one of those pages stops responding as it should and that situation isn't covered by a try/except block, the code will just crash. On top of that, it depends on three APIs and I'm using the free tier of all of them, so the rate limits are tight and it might crash now and then.
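One way to soften that limitation would be to wrap every external request in a try/except with a couple of retries. This is just a sketch of the idea, not code the bot currently runs; the helper name and the retry parameters are made up:

```python
import time
import requests

def safe_get_json(url, retries=3, wait=60):
    """Hypothetical helper: retry a flaky endpoint instead of crashing the bot."""
    for _ in range(retries):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()  # turn HTTP errors (e.g. rate limits) into exceptions
            return res.json()
        except requests.RequestException:
            time.sleep(wait)  # back off before trying again
    return None  # caller decides what to do when the service stays down
```

Every place the bot calls requests.get(...).json() could then check for None and skip that report instead of dying.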

Why is the length of the city limited to 25 characters?

As most of you are aware, people do bad stuff on the internet, and some of them might be looking forward to destroying your carefully thought-out Twitter bot; in fact, some people warned me that they would do it. Now that the code is public it's probably even easier for them (I guess, I don't really know), but I still tried to protect myself by putting a limit on the number of characters the code will take, to avoid code injection or things like that. But what is the rationale for choosing 25 characters and not 30 or 20?

Well, as y'all know, I am learning data science, so I used pandas to study how many letters city names have. To do so I followed these steps:
  1. I downloaded a database with the names of cities around the world (https://simplemaps.com/data/world-cities)
  2. I plotted a histogram of the number of cities vs. the number of letters in each city's name.
  3. After a bit of statistical analysis I found that 25 characters cover more than 99% of the cities in the world, so I figured cities like Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch (https://es.wikipedia.org/wiki/Llanfairpwllgwyngyll) can do without knowing what the weather is like over there.
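The steps above can be sketched roughly like this. The real analysis used the full simplemaps CSV; here I use a tiny made-up sample, and the "city" column name is an assumption about that file:

```python
import pandas as pd

# Made-up sample; the real run loaded the simplemaps file, e.g.
# df = pd.read_csv("worldcities.csv")
df = pd.DataFrame({"city": [
    "Bilbao",
    "Santander",
    "Castro-Urdiales",
    "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch",
]})

lengths = df["city"].str.len()

# Fraction of city names that fit under the 25-character limit
coverage = (lengths < 25).mean()
print(f"{coverage:.0%} of this sample fits")

# Step 2's histogram: number of cities per name length
# lengths.value_counts().sort_index().plot(kind="bar")
```

On the full dataset that coverage figure is where the "more than 99%" came from.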


That's about it for this post, I hope you liked it. I will keep working on the Twitter bot, adding some of the features I listed above, and I will keep you informed.

I will soon post something about the Analytic Hierarchy Process (https://en.wikipedia.org/wiki/Analytic_hierarchy_process_%E2%80%93_leader_example) in case you want to check it out.


:D





