What does the 'u' symbol mean in front of string values? [duplicate]
Yes in short i would like to know why am I seeing a u in front of my keys and values.
I am rendering a form. The form has check-box for the particular label and one text field for the ip address. I am creating a dictionary with keys being the label which are hardcoded in the list_key and values for the dictionary are taken from the form input (list_value). The dictionary is created but it is preceded by u for some values. here is the sample output for the dictionary:
{u'1': {'broadcast': u'on', 'arp': '', 'webserver': '', 'ipaddr': u'', 'dns': ''}}
can someone please explain what I am doing wrong. I am not getting the error when i simulate similar method in pyscripter. Any suggestions to improve the code are welcome. Thank you
#!/usr/bin/env python
import webapp2
import itertools
import cgi
form ="""
<form method="post">
FIREWALL
<br><br>
<select name="profiles">
<option value="1">profile 1</option>
<option value="2">profile 2</option>
<option value="3">profile 3</option>
</select>
<br><br>
Check the box to implement the particular policy
<br><br>
<label> Allow Broadcast
<input type="checkbox" name="broadcast">
</label>
<br><br>
<label> Allow ARP
<input type="checkbox" name="arp">
</label><br><br>
<label> Allow Web traffic from external address to internal webserver
<input type="checkbox" name="webserver">
</label><br><br>
<label> Allow DNS
<input type="checkbox" name="dns">
</label><br><br>
<label> Block particular Internet Protocol address
<input type="text" name="ipaddr">
</label><br><br>
<input type="submit">
</form>
"""
dictionarymain={}
class MainHandler(webapp2.RequestHandler):
def get(self):
self.response.out.write(form)
def post(self):
# get the parameters from the form
profile = self.request.get('profiles')
broadcast = self.request.get('broadcast')
arp = self.request.get('arp')
webserver = self.request.get('webserver')
dns =self.request.get('dns')
ipaddr = self.request.get('ipaddr')
# Create a dictionary for the above parameters
list_value =[ broadcast , arp , webserver , dns, ipaddr ]
list_key =['broadcast' , 'arp' , 'webserver' , 'dns' , 'ipaddr' ]
#self.response.headers['Content-Type'] ='text/plain'
#self.response.out.write(profile)
# map two list to a dictionary using itertools
adict = dict(zip(list_key,list_value))
self.response.headers['Content-Type'] ='text/plain'
self.response.out.write(adict)
if profile not in dictionarymain:
dictionarymain[profile]= {}
dictionarymain[profile]= adict
#self.response.headers['Content-Type'] ='text/plain'
#self.response.out.write(dictionarymain)
def escape_html(s):
return cgi.escape(s, quote =True)
app = webapp2.WSGIApplication([('/', MainHandler)],
debug=True)
The 'u' in front of the string values means the string is a Unicode string. Unicode is a way to represent more characters than normal ASCII can manage. The fact that you're seeing the u
means you're on Python 2 - strings are Unicode by default on Python 3, but on Python 2, the u
in front distinguishes Unicode strings. The rest of this answer will focus on Python 2.
You can create a Unicode string multiple ways:
>>> u'foo'
u'foo'
>>> unicode('foo') # Python 2 only
u'foo'
But the real reason is to represent something like this (translation here):
>>> val = u'Ознакомьтесь с документацией'
>>> val
u'\u041e\u0437\u043d\u0430\u043a\u043e\u043c\u044c\u0442\u0435\u0441\u044c \u0441 \u0434\u043e\u043a\u0443\u043c\u0435\u043d\u0442\u0430\u0446\u0438\u0435\u0439'
>>> print val
Ознакомьтесь с документацией
For the most part, Unicode and non-Unicode strings are interoperable on Python 2.
There are other symbols you will see, such as the "raw" symbol r
for telling a string not to interpret backslashes. This is extremely useful for writing regular expressions.
>>> 'foo\"'
'foo"'
>>> r'foo\"'
'foo\\"'
Unicode and non-Unicode strings can be equal on Python 2:
>>> bird1 = unicode('unladen swallow')
>>> bird2 = 'unladen swallow'
>>> bird1 == bird2
True
but not on Python 3:
>>> x = u'asdf' # Python 3
>>> y = b'asdf' # b indicates bytestring
>>> x == y
False
This is a feature, not a bug.
See http://docs.python.org/howto/unicode.html, specifically the 'unicode type' section.